1. EARTHQUAKE-EXPLOSION SEISMOGRAM DISCRIMINATION

1.1. Introduction 

Automatic discrimination of weak explosions and earthquakes at local
distances is an important problem for CTBT monitoring in regions where commercial
mining operations and quarry blasting generate a large number of seismic recordings
on a daily basis. Development of reliable discrimination techniques to solve this
problem involves selection of regionally dependent parameters (which have to be
automatically extracted from seismograms) and estimation of the misclassification
probabilities intrinsic to a given region.

Discrimination between weak regional or local earthquakes and explosions
remains a serious challenge for state-of-the-art CTBT monitoring. In particular, the
known fact that discrimination features which prove efficient in one seismic region
are often useless in another is a serious obstacle to refinement and
standardization of source discrimination techniques. Another important unsolved task
in seismic discrimination is to estimate the probability of event misclassification
inherent to the seismic region under investigation.

In recent years success has been achieved in techniques for discriminating
between earthquake and explosion sources by artificial neural networks. Nevertheless,
the potential of the conventional statistical discrimination approach is not yet
exhausted. Two general statistical approaches may be distinguished in the seismic
discrimination problem [1,2]:

1. Waveform discrimination approach: employment of a statistically optimal
hypothesis testing technique for discrimination between explosion and earthquake
seismograms. The seismograms are modeled in this approach as realizations of zero-mean
stochastic processes with unknown but different power spectral densities (PSD). The
discrimination is based on the well-known fact that the typical normalized PSD of P and
S phases are different for explosions and earthquakes. This difference is due to
the different physical nature of the sources. Various divergence measures between
stationary stochastic processes (used for modeling explosion and earthquake
seismograms) were proposed in [1,2] to construct the hypothesis testing statistics.
These measures include the quadratic informant, minimum discriminative
information, Renyi e-entropy, Kullback-Leibler J divergence, Chernoff quasi-distance
and so on. A summary of this approach is presented in [20].

Practical application of the above approach to seismic source discrimination
requires estimating the PSD of P and S phases belonging to different classes using
learning sets of explosions and earthquakes. These estimates have to be substituted
into the classification algorithms in place of the unknown PSD of the stochastic
processes that model the different classes. Our experiments showed that seismograms of weak
local earthquakes and explosions demonstrate considerable diversity of spectral
estimates. Therefore, simple averaging of spectral estimates calculated for learning
seismograms belonging to the same class inevitably results in strong smoothing,
which leads to poor classification capability. This did not encourage us to use the
approach recommended in [1,2] for discrimination of weak local events.

2. Feature extraction and feature discrimination approach: statistical
discrimination of relevant numerical features extracted from seismograms according to
some heuristic considerations. Power or spectral numerical characteristics of
seismograms which are typically different for earthquakes and explosions are used as
the discrimination features. The discrimination problem is solved by applying
classical statistical pattern recognition techniques to the feature sets.
Numerous investigations in discriminant analysis [3-6] demonstrated that selection of
a small number of the most informative features is extremely useful in this approach. It
was shown that a few carefully selected features may provide a smaller
classification error probability than the full set of features. This is the so-called
"peak effect" or "multivariate effect". Of course, the optimal feature subset selected for
the data of a given region may not be the same as that for another region. But an automatic
feature selection procedure makes it possible to select individual optimal subsets of
features for different regions of interest.

Note that there exists a vast number of nonparametric statistical pattern
recognition methods (Parzen's "kernels", "potential functions" methods and others
[10]). Various non-statistical methods of classification, such as the "nearest neighbor" and
"k-nearest neighbor" rules, are also frequently used in different applications [10]. The
logical classification algorithm "KORA" and its numerous modifications are widely
employed for prediction of future earthquake epicenter locations [21]. Nevertheless, in
our opinion, the parametric statistical approach is the most appropriate for our problem,
because the criterion of classification quality can be most generally formulated in
terms of probability theory and the classification error probabilities can be rather
easily calculated. Besides:

a) The parametric statistical classification theory is well developed. In
particular, it makes it possible to obtain, under quite realistic assumptions, theoretical
estimates of classification error probabilities and to perform theoretically optimal
feature selection;

b) The optimal parametric statistical algorithms often coincide with those
resulting from heuristic considerations;

c) The parametric statistical algorithms are rather simple to implement as an
automated computer tool for different monitoring purposes.

The theoretical assumption of a multidimensional Gaussian (normal)
distribution of the feature vectors, used for theoretical substantiation of many
parametric statistical algorithms, is not so severe, because it is possible to apply a
nonlinear transformation to the features (such as the well-known Box-Cox transformation
used in this research) which brings their distributions close to normal. The
statistical methods mentioned above have not so far been widely used in seismic
monitoring practice for selection of appropriate earthquake and explosion
discrimination parameters and for ensuring the reliability of earthquake-explosion
discrimination in a given region.

In this study we made an attempt to implement the parametric statistical
classification methods for discrimination between weak earthquakes and chemical
explosions recorded by the Israel local seismic network. A learning set of 28 earthquakes
and 25 explosions with magnitudes 1.1-2.6 recorded at distances of 30-200 km was
used in the study. The events were recorded by the Israel Seismic Network (ISN),
which is operated by the Seismological Division of the Institute for Petroleum
Research and Geophysics (IPRG), Israel. The database, supplied with ground truth
information, was collected by Dr. Y. Gitterman (IPRG) and used in [19]. The
waveforms and ISN bulletin information were prepared and transferred by Dr. V. Pinsky
(IPRG).

A variety of discrimination parameters based on the relative power spectral
distributions of P and S phases was extracted from the event wavetrains and processed
by special statistical procedures in order to select the most informative features
and attain the minimum probability of discrimination errors. Estimates of the
power spectrum of seismic noise at intervals preceding the event wavetrains were used
to improve the quality of discrimination feature measurement under the conditions of poor
signal-to-noise ratio which are typical for weak event recordings.

The feature selection procedures allowed us to extract automatically
the 5 most informative features from more than 20 that seemed relevant for
discrimination from heuristic considerations. Implementation of noise refinement and
feature selection procedures, nonlinear feature transformation and quadratic
discrimination made it possible to achieve an average misclassification probability
(estimated with the help of a statistically consistent cross-validation method) equal to
3.8%: only 2 events (explosions) out of the 53 events under study were incorrectly
classified.

An automated seismogram discrimination technique was developed to provide
implementation of the proposed algorithms in seismic monitoring practice. It was
designed with the help of the Seismic Network Data Analysis (SNDA) System, a
problem-oriented programming shell developed at the Moscow IRIS Data Analysis
Center/SYNAPSE Science Center [7], in which the program package for statistical
identification was incorporated.

1.2. Overview of theoretical methods for statistical classification, feature selection, and error probability estimation

1.2.1. Statistical approach to the classification problem. 

There exists an extensive bibliography on the statistical methods of
discriminant analysis. Reviews of this problem may be found in [5,8-12]. Below
we give a very brief sketch of the statistical approach to discriminant analysis.

Let us consider the following parametric model: a family of distributions in the
$p$-dimensional feature space $R^p$ that is parametrized by a $k$-dimensional
parameter $\theta$: $\{P_\theta;\ \theta \in \Theta \subset R^k\}$. Denote the corresponding
set of probability density functions (p.d.f.) as $\{f(x;\theta);\ \theta \in \Theta \subset R^k\}$.
The p.d.f. $f(x;\theta)$ is regarded as a known function of the argument $x$ and the
parameter $\theta$. Two distributions $P_1$ and $P_2$ with the corresponding p.d.f.
$f(x;\theta_1)$ and $f(x;\theta_2)$ serve as the probabilistic model of two classes
of feature vector variations. The learning vector samples
$X_{n_1} = \{x_1(1), x_2(1), \ldots, x_{n_1}(1)\}$ and
$X_{n_2} = \{x_1(2), x_2(2), \ldots, x_{n_2}(2)\}$ are regarded as mutually independent random
vectors with p.d.f. $f(x;\theta_1)$ and $f(x;\theta_2)$ respectively. The vector $x$ to be
classified is regarded as a random vector independent of the learning samples $X_{n_1}$ and
$X_{n_2}$ and having p.d.f. $f(x;\theta_0)$, where the parameter $\theta_0$ is unknown but can
be equal only to $\theta_1$ or $\theta_2$. The discrimination problem is then interpreted, in
the terminology of theoretical statistics, as testing of composite statistical hypotheses on
the basis of the observations $X = \{x, X_{n_1}, X_{n_2}\}$. The hypotheses are
$H_1$: $\theta_0 = \theta_1$ and $H_2$: $\theta_0 = \theta_2$. This
most general, so-called three-sample statistical interpretation of the classification
problem was suggested by C.R. Rao in 1954 [13]. There are several strategies and
approaches to the solution of this problem [11], such as Bayesian, maximum likelihood
ratio, minimax, adaptive and so on. We briefly discuss some of these approaches below.

In the Bayesian classification approach $\theta_1$ and $\theta_2$ are regarded as independent
random variables with a priori probability density functions $p(\theta_1)$ and $p(\theta_2)$
correspondingly. The learning sample sets $X_{n_1}$, $X_{n_2}$ are treated as sequences of
independent observations of vector variables from $R^p$ with conditional p.d.f.
$f(x|\theta_1)$ and $f(x|\theta_2)$, where the fixed $\theta_1$, $\theta_2$ are sample values of
random variables from $R^k$ with p.d.f. $p(\theta_1)$ and $p(\theta_2)$. The vector $x$ to be
classified is regarded as a sample value of a vector variable with conditional p.d.f.
$f(x|\theta_0)$, where $\theta_0$ is a sample value of a random variable from $R^k$ with an
unknown a priori distribution: $p(\theta_1)$ or $p(\theta_2)$, which occur
with a priori probabilities $P_1$ and $P_2 = 1 - P_1$. It is known that the statistic of the
Bayesian discrimination rule (which minimizes the average Bayesian risk) has the form of an
averaged likelihood ratio ([11], eq. (32)). In the important particular case where
$f(x|\theta_l)$, $l = 1,2$, are multivariate Gaussian distributions with random mean vectors
$\mu_1$ and $\mu_2$ having different Gaussian distributions and a random matrix $T$ having
the Wishart distribution, the Bayesian discrimination rule can be calculated
analytically ([12], formula (57)). For practical applications of this rule, a priori
information concerning the parameters of the a priori distributions of $\mu_l$ and $T$ is
required. If this information is absent, the asymptotic form of the Bayesian rule may be used
for the natural asymptotic in which this a priori information "disappears", i.e. the
dispersions of all a priori distributions tend to infinity and $P_1$, $P_2$ tend to 1/2. This
asymptotic Bayesian statistic has the form ([12], eq. (62)):

$$
B_{as} =
\frac{\Gamma\left(\frac{n_2}{2}\right)\,\Gamma\left(\frac{n_1-p}{2}\right)}
     {\Gamma\left(\frac{n_2-p}{2}\right)\,\Gamma\left(\frac{n_1}{2}\right)}
\left(\frac{n_1+1}{n_2+1}\right)^{p/2}
\frac{|\hat\Sigma_1|^{1/2}}{|\hat\Sigma_2|^{1/2}}
\cdot
\frac{\left[1+\frac{1}{n_1+1}\,(x-\bar x^{(1)})^T\hat\Sigma_1^{-1}(x-\bar x^{(1)})\right]^{n_1/2}}
     {\left[1+\frac{1}{n_2+1}\,(x-\bar x^{(2)})^T\hat\Sigma_2^{-1}(x-\bar x^{(2)})\right]^{n_2/2}}
\tag{1}
$$

where $\bar x^{(1)}$, $\bar x^{(2)}$ and $\hat\Sigma_1$, $\hat\Sigma_2$ are the sample means
and sample covariance matrices of the learning vector sets $X_{n_1}$, $X_{n_2}$, and
$|\hat\Sigma_l|$, $l = 1,2$, are the determinants of the sample
matrices. If $B_{as} > 1$, the decision is made that hypothesis $H_2$ is true; if
$B_{as} < 1$, the decision is made that the opposite hypothesis $H_1$ is true. Note that only
the learning vector data $X_{n_1}$, $X_{n_2}$ and the vector $x$ being classified are used in
equation (1).

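The maximum likelihood discrimination rule has the form ([11], p. 24):
if $r > 1$, the hypothesis $H_2$ is adopted;
if $r < 1$, the hypothesis $H_1$ is adopted,

where

$$
r = \frac{\sup_{\theta_1,\theta_2 \in \Theta} f(x|\theta_2)\, f(X_{n_1}|\theta_1)\, f(X_{n_2}|\theta_2)}
         {\sup_{\theta_1,\theta_2 \in \Theta} f(x|\theta_1)\, f(X_{n_1}|\theta_1)\, f(X_{n_2}|\theta_2)}
\tag{2}
$$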


In the case where $f(x;\theta_1)$ and $f(x;\theta_2)$ are the multivariate Gaussian
distributions $N\{\mu_l, \Sigma\}$, $l = 1,2$, with a common covariance matrix $\Sigma$, the
rule based on the statistic $r$ in eq. (2) is equivalent to the rule based on the well-known
Anderson maximum likelihood statistic $M$ ([12], p. 46, eq. (29)), with $M > 0$ in place of
$r > 1$:

$$
M = \frac{n_1}{n_1+1}\,(x-\bar x^{(1)})^T S_{n_1+n_2}^{-1}(x-\bar x^{(1)})
  - \frac{n_2}{n_2+1}\,(x-\bar x^{(2)})^T S_{n_1+n_2}^{-1}(x-\bar x^{(2)})
$$

where $S_{n_1+n_2}$ is the unbiased estimate of the common covariance matrix $\Sigma$ of the
observations, obtained from the total learning vector set.

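In the case where $f(x;\theta_1)$ and $f(x;\theta_2)$ are the multivariate Gaussian
distributions $N\{\mu_l, \Sigma_l\}$, $l = 1,2$, with different covariance matrices $\Sigma_1$
and $\Sigma_2$, the statistic $r$ in eq. (2) is defined by the formula ([12], eq. (30)):

$$
r =
\frac{\left(\frac{n_1}{n_1+1}\right)^{p(n_1+1)/2}}
     {\left(\frac{n_2}{n_2+1}\right)^{p(n_2+1)/2}}
\cdot
\frac{|\hat\Sigma_1|^{1/2}}{|\hat\Sigma_2|^{1/2}}
\cdot
\frac{\left[1+\frac{1}{n_1+1}\,(x-\bar x^{(1)})^T\hat\Sigma_1^{-1}(x-\bar x^{(1)})\right]^{(n_1+1)/2}}
     {\left[1+\frac{1}{n_2+1}\,(x-\bar x^{(2)})^T\hat\Sigma_2^{-1}(x-\bar x^{(2)})\right]^{(n_2+1)/2}}
\tag{3}
$$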

In accordance with the minimax strategy one has to find the discrimination
rule that minimizes (in comparison with any other rule) the maximum value of a risk
function $P(\theta_1,\theta_2)$ over all possible $\theta_1, \theta_2 \in \Theta$. This strategy
is very difficult to implement. In the case of multivariate Gaussian distributions
$N\{\mu_l, \Sigma\}$, $l = 1,2$, with a common and known covariance matrix $\Sigma$ the minimax
discrimination rule was derived in 1973 by S. Das Gupta. He found that in this case the
minimax statistic coincides with Anderson's maximum likelihood statistic (3)
where $\hat\Sigma_l$, $l = 1,2$, have to be replaced by the known covariance matrix $\Sigma$.

Adaptive ("plug-in") discrimination rules have historically been developed the most.
In particular, adaptive rules based on the likelihood ratio calculated for known
parameters of the distributions of the observations $X = \{x, X_{n_1}, X_{n_2}\}$ under the
hypotheses $H_1$ and $H_2$ are most commonly used. An adaptive rule of this type has the form:

if $\ln \hat L(x) > 0$, the hypothesis $H_2$ is adopted;
if $\ln \hat L(x) < 0$, the hypothesis $H_1$ is adopted,

where $\hat L(x) = f(x;\hat\theta_2)/f(x;\hat\theta_1)$, and $\hat\theta_1$, $\hat\theta_2$ are
consistent estimators of $\theta_1$, $\theta_2$.
Adaptive rules of this type usually provide rather simple discrimination algorithms that are
relevant in practical applications. From the theoretical point of view they are
asymptotically optimal (under weak restrictions) if the estimators $\hat\theta_1$,
$\hat\theta_2$ are consistent. This means that the minimum misclassification probability is
guaranteed when the numbers $n_1$, $n_2$ of learning vectors tend to infinity and the assumed
conditional p.d.f. $f(x;\theta)$ corresponds to the real distribution of observations
belonging to the different classes when the parameter $\theta$ takes the values $\theta_1$,
$\theta_2$.

In the case of multivariate Gaussian distributions $N\{\mu_l, \Sigma\}$, $l = 1,2$, with a
common covariance matrix $\Sigma$ we get the well-known discrimination rule based on the
linear discriminant function (LDF):

If LDF > 0, the hypothesis $H_2$ is adopted;
If LDF < 0, the hypothesis $H_1$ is adopted,

where

$$
\mathrm{LDF} = \left[x - \tfrac{1}{2}\left(\bar x^{(2)} + \bar x^{(1)}\right)\right]^T
S_{n_1+n_2}^{-1}\left(\bar x^{(2)} - \bar x^{(1)}\right).
\tag{4}
$$

In the case of multivariate Gaussian distributions $N\{\mu_l, \Sigma_l\}$, $l = 1,2$, with
different covariance matrices $\Sigma_1$ and $\Sigma_2$ we get the well-known discrimination
rule based on the quadratic discriminant function (QDF):

If QDF > 0, the hypothesis $H_2$ is adopted;
If QDF < 0, the hypothesis $H_1$ is adopted,

where

$$
\mathrm{QDF} = (x-\bar x^{(1)})^T S_{n_1}^{-1}(x-\bar x^{(1)})
             - (x-\bar x^{(2)})^T S_{n_2}^{-1}(x-\bar x^{(2)})
             + \ln\frac{|S_{n_1}|}{|S_{n_2}|}.
\tag{5}
$$


It was proved in [12] that in the case of multivariate Gaussian distributions
$N\{\mu_l, \Sigma_l\}$, $l = 1,2$, with different covariance matrices $\Sigma_1$ and $\Sigma_2$
the asymptotic Bayesian discrimination rule (1), the maximum likelihood rule (3) and the rule
based on the QDF (5) are asymptotically equivalent when the numbers of learning vectors
$n_1$, $n_2$ tend to infinity. The same is valid for the rules (2) and (4) based on the
Anderson M-statistic and the LDF (for the Gaussian case with a common covariance matrix).
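The plug-in rules of eqs. (4) and (5) can be sketched in a few lines of Python. This is only an illustrative sketch, not the program package used in the study: the function names (`fit_class_stats`, `ldf`, `qdf`) and the synthetic two-feature data are assumptions introduced here, and NumPy is assumed to be available. A positive value of either function corresponds to adopting $H_2$ (class 2).

```python
import numpy as np

def fit_class_stats(X1, X2):
    """Sample means, per-class covariances and pooled covariance (hypothetical helper)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = np.cov(X1, rowvar=False)          # unbiased sample covariance, class 1
    S2 = np.cov(X2, rowvar=False)          # unbiased sample covariance, class 2
    n1, n2 = len(X1), len(X2)
    # Unbiased estimate of the common covariance from the total learning set
    S_pooled = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)
    return m1, m2, S1, S2, S_pooled

def ldf(x, m1, m2, S_pooled):
    """Linear discriminant function, eq. (4): > 0 means class 2."""
    return (x - 0.5 * (m1 + m2)) @ np.linalg.solve(S_pooled, m2 - m1)

def qdf(x, m1, m2, S1, S2):
    """Quadratic discriminant function, eq. (5): > 0 means class 2."""
    d1 = (x - m1) @ np.linalg.solve(S1, x - m1)
    d2 = (x - m2) @ np.linalg.solve(S2, x - m2)
    _, logdet1 = np.linalg.slogdet(S1)
    _, logdet2 = np.linalg.slogdet(S2)
    return d1 - d2 + logdet1 - logdet2

# Synthetic learning sets (illustrative only, not the ISN data)
rng = np.random.default_rng(0)
X1 = rng.normal(0.0, 1.0, size=(30, 2))    # class 1
X2 = rng.normal(3.0, 1.0, size=(30, 2))    # class 2
m1, m2, S1, S2, S_pooled = fit_class_stats(X1, X2)

x_new = np.array([3.0, 3.0])
label_ldf = 2 if ldf(x_new, m1, m2, S_pooled) > 0 else 1
label_qdf = 2 if qdf(x_new, m1, m2, S1, S2) > 0 else 1
```

Solving the linear systems with `np.linalg.solve` instead of forming explicit inverses is a design choice of the sketch; it gives the same quadratic forms with better numerical behavior.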

In our experimental studies we used the discrimination rules based on the LDF
and QDF because these rules are asymptotically optimal in the case of multivariate
Gaussian distributions and are easy to implement. Besides, these rules are optimal in
the following heuristic sense: the LDF and QDF compare the $p$-dimensional distances from
the vector $x$ being classified to the centers of the classes determined by the sample means
$\bar x^{(1)}$ and $\bar x^{(2)}$, and assign the vector $x$ to the class corresponding to
the minimal distance. The distances are calculated for generalized Euclidean metrics defined
by the matrices $S_{n_1}^{-1}$ and $S_{n_2}^{-1}$ (which are equal in the LDF case). The
restriction on practical application of the LDF and QDF connected with the necessity of a
multidimensional Gaussian (normal) distribution of the observations can be relaxed by
a nonlinear transformation of the features: there exist transformations (for
example, the simplest ones: $y = \log(x)$ and the Box-Cox transformation
$y = (x^{\alpha} - 1)/\alpha$) which, when applied
to the discrimination features, make their distribution close to Gaussian.
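As a small illustration of the transformations just mentioned, the sketch below applies the Box-Cox transformation to a few positive feature values. The function name `box_cox` and the sample values are hypothetical; the exponent is passed as `alpha`, and the case `alpha = 0` is taken as the logarithm, which is the standard limiting form.

```python
import math

def box_cox(x, alpha):
    """Box-Cox transform y = (x**alpha - 1) / alpha; y = log(x) when alpha == 0."""
    if x <= 0:
        raise ValueError("the Box-Cox transform requires x > 0")
    if alpha == 0:
        return math.log(x)
    return (x ** alpha - 1.0) / alpha

# Illustrative positive feature values (for example, spectral power ratios)
values = [0.5, 1.0, 4.0, 9.0]
transformed = [box_cox(v, 0.5) for v in values]
```

In practice the exponent would be chosen so that the transformed learning features look as close to Gaussian as possible; how that choice was made in the study is not stated in this section.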

In practical application of the LDF and QDF some computational difficulties
may occur while inverting the sample matrices $S_{n_1+n_2}$, $S_{n_1}$, $S_{n_2}$: they can be
ill-conditioned if some features are strongly correlated. These difficulties can be overcome
by regularization of the sample covariance matrices, for example, by replacing
their diagonal elements $s_{ii}$ by the values $s_{ii}(1+\rho)$, where $\rho$ is a
regularization coefficient chosen from heuristic considerations.
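The diagonal regularization described above can be sketched as follows. The function name `regularize`, the example matrix and the value $\rho = 0.05$ are assumptions chosen for illustration; NumPy is assumed to be available.

```python
import numpy as np

def regularize(S, rho):
    """Return a copy of S with each diagonal element s_ii replaced by s_ii * (1 + rho)."""
    S_reg = np.array(S, dtype=float, copy=True)
    idx = np.diag_indices_from(S_reg)
    S_reg[idx] *= (1.0 + rho)
    return S_reg

# Two almost perfectly correlated features give an ill-conditioned matrix
S = np.array([[1.0, 0.999],
              [0.999, 1.0]])
S_reg = regularize(S, 0.05)

cond_before = np.linalg.cond(S)      # large condition number
cond_after = np.linalg.cond(S_reg)   # noticeably smaller after regularization
```

The off-diagonal elements are left unchanged, so the regularized matrix stays symmetric while its smallest eigenvalue moves away from zero.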

1.2.2. Feature selection. 

The feature selection problem in statistical discriminant analysis is a part
of the more general problem of multivariate statistical analysis connected with
reducing the dimension of the random variables being processed. Different dimension
reduction methods are used in discriminant analysis, such as regression and
correlation analysis, the method of canonical variables, the principal component method,
the method of two discriminant directions and so on [10]. The main idea of these
approaches is to pass to some space of linear combinations of the input variables which
has a smaller dimension. The drawback of these methods is the difficulty of a
"physical" interpretation of the linear combinations used. The simplest alternative method
of dimension reduction used in discriminant analysis is selection of a subset of
the "most informative features" from the total feature set. At the early stage of
the development of discriminant analysis it seemed that increasing the number of features
improves the classification quality or at least does not make it worse. Then it was
realized that this assertion is correct only when all parameters of the feature
distributions are known a priori for every class; as a rule, it is incorrect
when the unknown parameters have to be estimated from learning feature
vector sets of bounded sample size. Numerous investigations in discriminant
analysis [3-6] demonstrated that a few carefully selected features may provide a smaller
classification error probability than the full feature set. This is the so-called
"peak effect" or "multivariate effect". The effect may be explained in the following
manner: the number of parameters to be estimated from the bounded learning sets increases
rapidly (often quadratically) with the number of features
used; random variations of the estimated parameters are strongly correlated and, as a
rule, add up during calculation of the discrimination statistic, which
amplifies the random variations of the statistic; this results in growth of the classification
error probability. On the other hand, increasing the number of features $k$ leads to
growth of the Kullback-Leibler distance between the multidimensional feature
distributions corresponding to the two classes. This results in a decrease of the
classification error probability. These two tendencies acting in opposite
directions lead one to expect a minimum of the error
probability curve $P(k)$ as $k$ increases. More thorough investigations
revealed that this minimum occurs when the growth of the Kullback-Leibler distance
with increasing $k$ is exhausted for large values of $k$.

The quantitative analysis of the "peak effect" is a very difficult mathematical
task, so analytical expressions for the error probability curve $P(k)$ have been
derived only for a few of the simplest cases, in particular, for the case of two
multivariate Gaussian (normal) feature distributions with equal covariance matrices.
For these distributions the Kullback-Leibler distance coincides with the
Mahalanobis distance $D(k)$. An asymptotic expansion for the average error
probability $P_L(k)$ provided in this case by the linear discrimination function (LDF)
was studied by A.D. Deev [3]. He derived a rather simple asymptotic formula
(the Kolmogorov-Deev formula) for $P_L(k)$ in the asymptotic where the sizes of the learning
vector sets $n_1$ and $n_2$ and the number of features $k$ tend to infinity while $k/n_1$,
$k/n_2$ and the Mahalanobis distance tend to constants. This asymptotic is very relevant for
describing the typical practical situation where the number of features gathered for
discriminant analysis is commensurable with the numbers of learning feature vectors
available. The idea of applying this asymptotic to investigation of the quality of
discriminant analysis was put forward by A.N. Kolmogorov, and it proved to be a very
productive idea used in numerous further investigations.

The Kolmogorov-Deev formula was used in our study to design a step-wise
feature selection procedure. This procedure generates an optimal feature subset
consisting of $k_0$ features that provides an error probability equal (or close) to the
minimum of the $P_L(k)$ probability curve. The procedure is recurrent and avoids an
exhaustive search through all feature subsets. Several types of step-wise
(recurrent) feature selection procedures can be constructed using the following
selection strategies: one-by-one addition of features, one-by-one elimination of
features, step-wise addition of one feature with elimination of another, and so on. In
this study we used the step-wise procedure with one-by-one feature addition. This
procedure is comparable with the exhaustive search procedure in the reliability of
attaining the $P_L(k)$ minimum but is significantly less time consuming.

Feature selection is based on processing of the learning vectors
$\{x_l(j);\ j \in (1, n_l),\ l \in (1, 2)\}$, where $l$ is the class number and $n_l$ is the
number of vectors in the learning set of class $l$. Initially each vector consists of $p$
features gathered as relevant for a given discrimination problem. The feature selection
procedure consists of $p$ steps. At any intermediate step $k$ only some $k < p$ features are
involved, so the $x_l(j)$ are $k$-dimensional feature vectors at this step. As a basis for
the feature selection we use the informational distance between the two $k$-dimensional
probability distributions of the learning vectors, which is called the Mahalanobis distance

$$
R(k) = \left(m(k,1) - m(k,2)\right)^T S^{-1}(k) \left(m(k,1) - m(k,2)\right),
\tag{6}
$$

where $m(k,1)$, $m(k,2)$ are the sample mean vectors of the $k$ features for the 1st and 2nd
classes, and $S^{-1}(k)$ is the ($k \times k$) inverse sample covariance matrix calculated for
these features using the learning data of both classes.

At the first step of the selection procedure $p$ values of the $R(1)$ distance are
calculated, one for every feature. The maximum of these $p$ values is attained for some
feature $j(1)$, which is thus selected as the optimal one. At the second step $p-1$ values of
the $R(2)$ distance are calculated for the feature pairs. The first member of every pair is
always the previously selected feature with number $j(1)$; the second member is any of the
remaining features. The second optimal feature is selected as the one providing the maximum
among these $R(2)$ values. At the $k$-th step of the selection procedure $p-k+1$ values of the
$R(k)$ distance are calculated for the vectors composed of $k$ features. The first $k-1$
components of these vectors are the optimal features selected at the previous steps;
the $k$-th component is any feature from the remaining $p-k+1$ ones.
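The one-by-one addition procedure described above can be sketched in Python as follows. This is an illustrative sketch rather than the study's own implementation: the function names (`mahalanobis_r`, `forward_order`) and the synthetic data are assumptions, the covariance of eq. (6) is taken here as the pooled within-class sample covariance, and NumPy is assumed to be available. The sketch covers only the ordering of features by the distance $R(k)$; choosing the cut-off $k_0$ requires an error probability estimate and is not shown.

```python
import numpy as np

def mahalanobis_r(X1, X2, idx):
    """Distance R(k) of eq. (6) for the feature subset idx (list of column numbers)."""
    A, B = X1[:, idx], X2[:, idx]
    mA, mB = A.mean(axis=0), B.mean(axis=0)
    n1, n2 = len(A), len(B)
    # Pooled sample covariance from the learning data of both classes (assumption)
    S = ((A - mA).T @ (A - mA) + (B - mB).T @ (B - mB)) / (n1 + n2 - 2)
    diff = mA - mB
    return float(diff @ np.linalg.solve(S, diff))

def forward_order(X1, X2):
    """Rearrange features by one-by-one addition, maximizing R(k) at every step."""
    remaining = list(range(X1.shape[1]))
    selected, distances = [], []
    while remaining:
        scores = [mahalanobis_r(X1, X2, selected + [j]) for j in remaining]
        best = int(np.argmax(scores))
        selected.append(remaining.pop(best))
        distances.append(scores[best])
    return selected, distances

# Synthetic learning sets: only feature 2 separates the classes (illustrative only)
rng = np.random.default_rng(1)
X1 = rng.normal(size=(40, 3))
X2 = rng.normal(size=(40, 3))
X2[:, 2] += 3.0

order, dists = forward_order(X1, X2)   # order[0] is the most informative feature
```

Each step evaluates at most $p$ candidate subsets, so the whole ordering needs on the order of $p^2/2$ distance evaluations instead of the $2^p$ subsets of an exhaustive search.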

The procedure described rearranges the initial order of the features in the learning
vectors to provide the most rapid increase of the Mahalanobis distance. To
select the most informative subset of the features, the estimate of the misclassification
probability $P_L(k)$ is calculated for every step $k$ ($k = 2, \ldots, p$) of the procedure
using the Kolmogorov-Deev formula [3]:

$$
P_L(k) = \frac{1}{2}\left[1 - T_k\!\left(\frac{R(k)}{\sigma(k)}\right)
                            + T_k\!\left(-\frac{R(k)}{\sigma(k)}\right)\right]
\tag{7}
$$

where

$$
\sigma^2(k) = \frac{t+1}{t}\,\left[r_1 + r_2 + R(k)\right]; \quad
t = \frac{r_1 + r_2}{r_1 r_2} - 1; \quad r_1 = k/n_1; \quad r_2 = k/n_2;
$$

$$
T_k(z) = F(z) + \frac{1}{k-1}\left(a_1 - a_2 H_1(z) + a_3 H_2(z) - a_4 H_3(z)\right) f(z);
$$

$F(z)$ is the cumulative function of the standard Gaussian probability distribution; $f(z)$ is
the density of this distribution; $H_i(z)$ is the Hermite polynomial of order $i$,
$i = 1,2,3$; $a_j$, $j = 1, \ldots, 4$, are coefficients depending on $k$, $n_1$, $n_2$ and
$R(k)$ [3].

The selection method is based on the theoretical interpretation of the fact that
in practice the Mahalanobis distance $R(k)$ is a monotonically increasing function of
$k$ whose growth tends to be exhausted as $k \to p$. Under this condition the function
$P_L(k)$ has a minimum at some step $k_0$ between 1 and $p$. Thus, as a result of executing
the procedure one gets the set of optimal (most informative) features with
initial numbers $j(1), j(2), \ldots, j(k_0)$, i.e. the set of features selected at the steps
$1, 2, \ldots, k_0$.

Note that the optimal features selected in this way provide the absolute minimum of the
average misclassification probability only in the case where the multivariate feature
distributions are Gaussian with equal covariance matrices and the discrimination
itself is performed with the help of the linear discrimination function. If these
assumptions are not valid, there is no theoretical guarantee that this set is the best
solution. Nevertheless, numerous tests, practical applications of the procedure
described and some heuristic considerations assure us that the optimality of the
feature set selected in this way is preserved outside the confining assumptions mentioned.
However, the estimated value of the $P_L(k)$ minimum seems to be, as a rule, excessively
pessimistic. So the real misclassification probability provided by the employed (not
necessarily linear) discrimination function operating with the optimal feature set has to
be independently estimated with the help of statistically consistent procedures. Such
procedures are discussed below.

The feature selection algorithm explained above is implemented in the computer
program "fsel".

1.2.3. Estimation of misclassification probability. 

At least four types of misclassification probability are used in discriminant
analysis. They are the following:

$P_B$ - the Bayesian misclassification probability: the absolute minimum of the
misclassification probability, provided by the Bayesian decision rule under the
assumptions that the multivariate feature distributions are completely known a priori and
the a priori probabilities of the classes $P_{Cl}$, $l = 1,2$, are known (equal to 1/2, for
example);

$P = P_{C1} P_{21} + P_{C2} P_{12}$ ($P_{C1} = P_{C2} = 1/2$, for example) - the conditional
misclassification probability provided by some fixed discrimination function (decision
rule) under the condition of fixed learning data. Here $P_{21}$, $P_{12}$ are the conditional
misclassification probabilities for classes 1 and 2 respectively. Note that $P$ is really a
random variable;

$P_{as}$ - the asymptotic (limit) value of $P$ for infinitely increasing sizes of the
learning sets: $n_1, n_2 \to \infty$;

$P_{av} = P_{C1} P_{av21} + P_{C2} P_{av12} = E\{P\}$ - the average value of $P$: the
mathematical expectation of the random value $P$ over the distributions of the learning
vectors. Note that it is calculated for some fixed decision rule. In the asymptotic
$n_1, n_2 \to \infty$, $P_{av}$ tends to $P_{as}$ for any reasonable discriminant function.
However, for not so large $n_1$, $n_2$ the values of $P$ and $P_{av}$ can differ significantly.

The average misclassification probability $P_{av}$ seems to be the most relevant and
convenient one for characterizing the quality of a discrimination procedure: though it is
related to the discriminant function used for decision making, it is independent of
the particular set of learning observations. However, the calculation of $P_{av}$ for a given
discriminant function is a much more difficult mathematical task than the calculation of
$P$. The Kolmogorov-Deev formula gives the asymptotic approximation
for the average misclassification probability $P_{av}$ provided by the linear discriminant
function under the assumption that the feature distributions are Gaussian with equal
covariance matrices. It serves as a basis for the design of a fast and effective procedure
for feature selection [4,5]. However, due to the asymptotic method of its derivation it
often gives higher probabilities of misclassification than those really achieved in
simulation experiments with moderate numbers (several tens) of learning vectors.

More or less realistic estimates of the misclassification probability are provided
by different modifications of the frequency ratio method. The essence of the method
is very simple: some part of the learning set is used for adjusting (learning) the
discriminant function, and another part (possibly overlapping with the first) is subjected
to classification by the learned discriminant function; the fraction of misclassified
vectors with respect to the total number of tested vectors is used as an estimate of the
misclassification probability. Let us denote by $m(j)$ the number of vectors from class
$j$ subjected to classification by the discriminant function, and by $m(i|j)$ the number of
vectors from class $j$ classified as belonging to class $i$; $i,j = 1,2$. Then
$v(i|j) = m(i|j)/m(j)$ is the frequency ratio of misclassifications for class $j$, and
$v = (1/2)(v(1|2) + v(2|1))$ is the total frequency ratio of misclassifications for both
classes.
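The frequency ratios defined above amount to simple counting, as the following sketch shows. The function name `frequency_ratios` and the label lists are hypothetical; classes are coded as 1 and 2.

```python
def frequency_ratios(true_labels, predicted_labels):
    """Return (v(1|2), v(2|1), v) for class labels coded as 1 and 2."""
    m = {1: 0, 2: 0}          # m(j): number of tested vectors from class j
    missed = {1: 0, 2: 0}     # number of class-j vectors assigned to the other class
    for t, p in zip(true_labels, predicted_labels):
        m[t] += 1
        if p != t:
            missed[t] += 1
    v12 = missed[2] / m[2]    # v(1|2): class 2 vectors classified as class 1
    v21 = missed[1] / m[1]    # v(2|1): class 1 vectors classified as class 2
    return v12, v21, 0.5 * (v12 + v21)

# Illustrative labels: 4 vectors of class 1 and 5 vectors of class 2
true_labels = [1, 1, 1, 1, 2, 2, 2, 2, 2]
predicted   = [1, 1, 1, 2, 2, 2, 2, 2, 1]
v12, v21, v = frequency_ratios(true_labels, predicted)
```

Note that $v$ averages the two per-class ratios with equal weights, so it differs from the pooled fraction of misclassified vectors whenever the classes have different sizes.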
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The next modifications of frequency ratio method are most often used, 

a) Reclassification method (R-method). In this method the adaptation (learning) of the
discriminant function is made using all available learning vectors. The same vectors
are then classified with the help of this discriminant function. The disadvantage of
the R-method is that the frequency ratio v is in this case a biased estimator of the P
and Pav values: as a rule it gives too optimistic values of the misclassification
probability (even asymptotically, for infinitely increasing sizes of the learning
vector sets). The bias of this estimator may be decreased by the bootstrap method.

b) Sample-Check method (C-method). In this method the learning vector set of every
class is divided into two parts. The first part is used for adaptation (learning) of the
discriminant function and the second part for classification by this discriminant
function. The frequency ratio v for the C-method provides a consistent estimator of the
P, Pas and Pav values: v → Pas when n1, n2 → ∞. The disadvantage of the C-method is
that the learning data are not used economically: once divided into two parts, the sets
in practice become too small to guarantee both a good adaptation (learning) of the
discriminant function and a good estimate of the misclassification probability by the
frequency ratio method. As a result, the latter is subject to serious statistical
variations if the sizes n1 and n2 of the learning vector sets are not sufficiently
large.

c) The cross-validation (jack-knife) method (CV-method). The most realistic estimate of
Pav is provided by examining the learning vector sets with the cross-validation
procedure [18]. In our experiments we used the linear discrimination function (LDF) and
the quadratic discrimination function (QDF) described above for making the
classification decisions. It is, of course, possible to implement more sophisticated
statistical discrimination rules or artificial neural network algorithms. In all cases
the cross-validation algorithm remains the same as described below.

At every step of the cross-validation procedure one of the learning vectors x_l(j),
j = 1,...,n_l, l = 1,2, is eliminated from the learning vector set. The remaining
vectors are used as the data for LDF or QDF adaptation (learning). The eliminated
vector is then classified by the LDF or QDF learned in this way. If this vector is
classified incorrectly, i.e. attributed to class 2 instead of class 1 or vice versa,
the corresponding count m(2|1) or m(1|2) is increased by one. The eliminated feature
vector is then returned to the learning data set and the next vector x_l(j) is
extracted. This procedure is repeated for all (n1 + n2) learning vectors. The values
v(1|2) = m(1|2)/n2, v(2|1) = m(2|1)/n1, v = (v(1|2) + v(2|1))/2 are asymptotically
unbiased and consistent estimates of Pav12, Pav21 and Pav for a wide class of
multidimensional feature distributions satisfying some weak restrictions [18].
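
The reclassification and cross-validation counts can be illustrated by a minimal
Python sketch. It is not the code of the programs used in this study. As a simplifying
assumption it uses a single feature, for which the LDF with equal class variances and
equal priors reduces to a threshold halfway between the class means; all function
names are hypothetical.

```python
from statistics import mean


def train_ldf_1d(class1, class2):
    """One-feature LDF (assumed equal variances and priors): midpoint threshold."""
    m1, m2 = mean(class1), mean(class2)
    threshold = (m1 + m2) / 2.0
    sign = 1.0 if m2 >= m1 else -1.0
    # Positive score -> class 2, non-positive score -> class 1.
    return lambda x: sign * (x - threshold)


def classify(ldf, x):
    return 2 if ldf(x) > 0 else 1


def reclassification(class1, class2):
    """R-method: learn on all vectors, then classify the same vectors."""
    ldf = train_ldf_1d(class1, class2)
    m12 = sum(classify(ldf, x) == 1 for x in class2)  # class 2 taken for class 1
    m21 = sum(classify(ldf, x) == 2 for x in class1)  # class 1 taken for class 2
    return m12, m21


def cross_validation(class1, class2):
    """CV-method: leave one vector out, learn on the rest, classify the left-out one."""
    m12 = m21 = 0
    for k, x in enumerate(class1):
        ldf = train_ldf_1d(class1[:k] + class1[k + 1:], class2)
        if classify(ldf, x) == 2:
            m21 += 1
    for k, x in enumerate(class2):
        ldf = train_ldf_1d(class1, class2[:k] + class2[k + 1:])
        if classify(ldf, x) == 1:
            m12 += 1
    return m12, m21


def frequency_ratios(m12, m21, n1, n2):
    """v(1|2) = m(1|2)/n2, v(2|1) = m(2|1)/n1 and their half-sum v."""
    v12, v21 = m12 / n2, m21 / n1
    return v12, v21, (v12 + v21) / 2.0


if __name__ == "__main__":
    c1, c2 = [1.0, 2.0, 3.0], [4.5, 5.0, 9.5]   # made-up feature values
    print("R-method :", reclassification(c1, c2))
    print("CV-method:", cross_validation(c1, c2))
```

On this made-up data the R-method reports no errors while the CV-method reports one,
which is the optimistic bias of reclassification described above.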

The LDF and/or QDF values for both classes produced by the cross-validation
procedure can be ranked by magnitude. The two ranked LDF and/or QDF sequences
make it possible to investigate the physical reasons for misclassifications caused by
outliers of the feature values.

1.3. Seismogram processing for event source identification: feature extraction, 
feature selection and misclassification probability estimation. 

(scheme of experimental study). 

One of the goals of this study was the development of a flexible, automated,
interactive technique for processing recordings of weak seismic events provided by a
local or regional seismic network, with the purpose of source type identification. A
prototype of such a technique was tested using the data handling and interactive
graphic facilities of the SNDA System and consisted of the following stages:

1) Mapping of the relative location of the event epicenter and the network stations,
visualization of the event seismograms recorded by the stations; interactive measurement
of distances from the epicenter to the stations and ordering of the seismograms on the
screen according to epicentral distance.

2) Selection of the best quality seismograms for further identification processing.
Two seismograms were selected from the total seismogram set for every event: one
recorded by a "nearby" station and another by a "far" station. Thus, four sets of
seismograms were formed: "near earthquakes", "far earthquakes", "near explosions" and
"far explosions". The purpose of this selection was to investigate the influence of
epicentral distance on the identification of weak local events: on the one hand, a
small epicentral distance guarantees a high signal-to-noise ratio, but on the other
hand, at small distances the wave field characteristics of the different wave phases
cannot be ascertained because the phase waveforms overlap.

3) Interactive setting, on every selected seismogram, of the time intervals in which
the P and S phase waveforms have to be processed for discrimination feature
extraction.

4) Automatic measurement of the set of spectral discrimination features from all
earthquake and explosion seismograms.

5) Improvement of the quality of feature estimation under poor signal-to-noise
conditions by evaluating the seismic noise power spectral density in the interval
preceding the event wavetrain.

6) 3-dimensional visualization of the feature vector data with the purpose of studying
the scattering of the features for earthquakes and explosions and their separation into
distinct clusters. The visualization is performed for various triplets from the total
feature set. It makes it possible to select those features which provide the best visual
separation between the sets of earthquake and explosion points in 3-dimensional
subspaces of the feature space, and to decide whether to apply nonlinear
transformations to the features in order to decrease their scattering.

7) Transformation of the features by a nonlinear function with the purpose of making
their distribution closer to Gaussian.

8) Manual and automatic selection of a feature subset from the initial feature set to
provide the minimum misclassification probability attainable in the given region (when
using the statistical identification algorithms).

9) Estimation of the classification error probabilities provided by different
classifiers with the help of the reclassification and cross-validation methods, using
the learning earthquake and explosion observations.

10) Identification of "unknown" events from this region which have not been used for
learning of the classification algorithms (if such events are available), with the
purpose of revising the classification error probability estimate made by the
cross-validation method.

Stages 1-7 are in fact the preliminary seismogram processing, which provides the
feature vector sets as proper input data for a statistical (or other) classification
algorithm. Stages 7-10 are the classification procedure itself with the accompanying
data manipulations. The preliminary processing steps (stages 1-7) are illustrated in
Fig.1. Stages 1 and 2 were performed with the help of special scripts (named
"selearthq.scr" and "selexpl.scr"), programs written in the internal language of the
SNDA System. The scripts are similar and differ only in the tables of event names;
the code of the "selexpl.scr" program is given in Appendix 1 to this section.

Stages 3-6 were performed with the help of the SNDA scripts "earthspmeas.scr",
"explspmeas.scr", "fftearthmeas.scr" and "fftexplmeas.scr". The code of the first
script is given in Appendix 2; the other scripts have the same structure and differ
from the first only in the tables of event names and some processing routines. The
spectral features of the seismogram P and S phase waveforms were measured in this
study by two alternative computational methods:

a) by filtering of the P and S phases in the frequency bands D0=(1-15) Hz,
D1=(1-3) Hz, D2=(3-6) Hz, D3=(6-10) Hz, D4=(10-15) Hz with subsequent calculation of
the following spectral ratios:

avp(P,Di)/avp(P,D0); avp(S,Di)/avp(S,D0); i=1,...,4;

avp(S,Di)/avp(P,Di); maxp(S,Di)/maxp(P,Di); i=0,...,4,

where avp(*,Di) and maxp(*,Di) denote the average power and peak power of the
corresponding phase waveforms in the corresponding frequency bands. The 18
features are calculated by this method with the help of the scripts "earthspmeas.scr"
and "explspmeas.scr".

b) by the discrete Fast Fourier Transform (FFT) of the P and S phase waveforms with
subsequent calculation of the following spectral ratios:

avp(P,Di)/avp(P,D0); avp(S,Di)/avp(S,D0); i=1,...,4;

avp(S,Di)/avp(P,Di); Maxsd(S,Di)/Maxsd(P,Di); fmax(P,Di); fmax(S,Di); i=0,...,4,

where Maxsd(S,Di)/Maxsd(P,Di) denotes the ratios of the peak values of the power
spectral density for the S and P phases in the different frequency bands; fmax(P,Di)
and fmax(S,Di) denote the frequencies inside the frequency bands Di at which the peak
values of the P and S phase power spectral densities are attained. The 23 spectral
ratio features are calculated by this method with the help of the scripts
"fftearthmeas.scr" or "fftexplmeas.scr".

Note that the features calculated with the help of both methods are relative
ones, i.e. they do not depend on event magnitudes or recording scale factors.
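
The following Python sketch illustrates how such relative spectral features can be
formed and why they are insensitive to a common scale factor. It is only an
illustration under stated assumptions, not the SNDA scripts: it works in the spectral
domain with a naive DFT periodogram, treats the bands as half-open intervals, computes
the 18 power ratios only (the peak-frequency features are omitted), and all function
names are hypothetical.

```python
import cmath
import math

# Frequency bands D0..D4 in Hz as listed in the text (assumed half-open here).
BANDS = {"D0": (1.0, 15.0), "D1": (1.0, 3.0), "D2": (3.0, 6.0),
         "D3": (6.0, 10.0), "D4": (10.0, 15.0)}


def power_spectrum(samples, fs):
    """Naive DFT periodogram: returns (frequencies in Hz, power per bin)."""
    n = len(samples)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_power(samples, fs):
    """Average and peak spectral power of one phase window in every band."""
    freqs, power = power_spectrum(samples, fs)
    stats = {}
    for name, (lo, hi) in BANDS.items():
        vals = [p for f, p in zip(freqs, power) if lo <= f < hi]
        stats[name] = (sum(vals) / len(vals), max(vals))
    return stats


def spectral_ratio_features(p_window, s_window, fs):
    """Band fractions (i = 1..4) and S/P ratios (i = 0..4): 18 relative features."""
    p, s = band_power(p_window, fs), band_power(s_window, fs)
    feats = {}
    for i in range(1, 5):
        b = f"D{i}"
        feats[f"avp(P,{b})/avp(P,D0)"] = p[b][0] / p["D0"][0]
        feats[f"avp(S,{b})/avp(S,D0)"] = s[b][0] / s["D0"][0]
    for i in range(5):
        b = f"D{i}"
        feats[f"avp(S,{b})/avp(P,{b})"] = s[b][0] / p[b][0]
        feats[f"Maxsd(S,{b})/Maxsd(P,{b})"] = s[b][1] / p[b][1]
    return feats
```

Multiplying both phase windows by the same factor changes every band power by the
same amount, so all ratios stay the same; the windows must be long enough for every
band to contain at least one frequency bin.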

To make the feature measurement procedure more robust to seismic noise, the values of
the average noise power were measured in the same frequency bands Di in a noise window
preceding the P-wave onset. The noise power values are subtracted from the
corresponding signal phase power values. This ensures more precise feature measurement
for an event wavetrain even if the signal-to-noise ratio of the event recording is
poor. Thus, two variants of the feature sets are created: with subtraction of the noise
power values ("pure" variant) and without this subtraction ("noise" variant).
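
A minimal Python sketch of the band-by-band noise power subtraction is given below.
The text does not say how a band is handled when the noise power estimate exceeds the
measured signal power; the small positive floor used here is purely an assumption
introduced to keep later ratios and logarithms defined, and the function name is
hypothetical.

```python
def compensate_noise(signal_power, noise_power, floor_fraction=0.01):
    """Subtract the noise band power from the signal band power ("pure" variant).

    signal_power, noise_power: dicts mapping a band name to average power.
    floor_fraction: assumed lower bound, as a fraction of the uncorrected power.
    """
    corrected = {}
    for band, p_sig in signal_power.items():
        p_pure = p_sig - noise_power[band]
        # Assumption: never let the corrected power drop to zero or below.
        corrected[band] = max(p_pure, floor_fraction * p_sig)
    return corrected


if __name__ == "__main__":
    s_phase = {"D1": 10.0, "D4": 1.0}   # made-up band powers
    noise = {"D1": 2.0, "D4": 1.5}
    print(compensate_noise(s_phase, noise))
```

The "noise" variant corresponds simply to using the uncorrected band powers.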

As a result of processing stages 1-5, eight feature data sets are prepared for the
earthquake and explosion seismograms according to the variants: 1) "pure" ("p") or
"noise" ("n") variants; 2) "Far" ("F") or "Near" ("N") events; 3) type of feature
extraction: filtering in the frequency bands ("b") or the fast Fourier transform ("s")
(see Fig.1 and Fig.2). As an example, the list of 18 feature labels obtained with the
help of the "filtration" method for the "pure" variant and the list of 16 feature
labels obtained with the help of the "FFT" method for the "pure" variant are given in
Appendix 3 to this section.

To utilize the advantages of both feature extraction methods, two combined feature
sets were prepared (for the "pure" variant): 1) "b+s" set: three features ("1psmfp",
"6ssmfp", "11rmspp") from the "FFT" method were combined with the 18 features of the
"filtration" method; 2) "s+b" set: five features ("9msprp0", "10msprp1", "11msprp2",
"12msprp3", "13msprp4") from the "filtration" method were combined with the 16 features
of the "FFT" method.


Processing stages 6-10, corresponding to the feature selection, error probability
estimation and event classification procedures with the accompanying data
manipulations, are performed in the framework of the special SNDA script "selfeat.scr",
composed of the programs "7 d", "Idstst", "fsel", "reclld", "examld", "examqd"
(described in Section 6) and some SNDA stack commands. The latter provide the nonlinear
feature transformation y = log(x) and the Box-Cox normalizing transformation
z = (1/α)(y^α - 1) with the exponent α = 1/7, visualization of three-dimensional
scattering diagrams of the feature vectors with the help of the SNDA Stack command
"cluster", and of two-dimensional scattering diagrams with the help of the standard
UNIX routine "plotxy". The script code is given in Appendix 4 to this section.
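
The two normalizing transformations can be sketched in Python as follows. This is an
illustration only: the function names are hypothetical, and the Box-Cox function is
written for a generic positive argument, since the power is defined for real numbers
only when the argument is positive.

```python
import math


def log_transform(x):
    """y = log(x) for a positive spectral ratio x."""
    return math.log(x)


def box_cox(v, alpha=1.0 / 7.0):
    """Box-Cox power transformation (v**alpha - 1) / alpha for v > 0."""
    if v <= 0:
        raise ValueError("Box-Cox transformation requires a positive argument")
    return (v ** alpha - 1.0) / alpha


if __name__ == "__main__":
    for ratio in (0.1, 1.0, 10.0, 100.0):   # made-up feature values
        print(ratio, log_transform(ratio), box_cox(ratio))
```

Both functions compress large ratios; for small exponents the Box-Cox transformation
approaches the logarithm.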


1.4. Results of experimental data processing and discussion 


The statistical classification approach described in Section 2 was applied to the
discrimination of weak earthquakes and explosions in the region of Israel. A set of
seismograms of weak events recorded by stations of the local Israel seismic network
was used as learning data for classification. The set consisted of recordings of 28
earthquakes with magnitudes 1.1-2.6 and 25 chemical explosions with magnitudes
1.3-2.6; each event was recorded by several stations of the network. For every event
two vertical seismograms were selected: one recorded by a nearby station (at a
distance of less than 100 km) and another by a rather far station (at a distance mostly
from 100 km to 200 km). The locations of the earthquake and explosion epicenters and
of the stations which recorded the selected seismograms are shown in Fig.3. One can
see from this figure that all the analyzed events, both earthquakes and explosions,
occurred in the same rather small region of size 80x80 km. The identical geological
conditions in the source areas of both types, of course, facilitate source type
identification, because the differences in source excitation mechanisms are not masked
in the seismograms by the impact of different geological conditions in the hypocentral
zones (as often happens in identification practice).

Fig.4 presents the sets of earthquake and explosion seismograms recorded at near and
far distances: the onsets of the P-waves are aligned, and the seismograms are ordered
according to epicentral distance and scaled to the waveform maximum. Comparison of the
event waveforms indicates that the earthquake and explosion seismograms show some
visual differences, which are more clearly manifested in the far distance sets. In
Fig.4c and 4d we see that the earthquake S-waves are more powerful relative to the
P-waves than in the case of the explosions. This is evident for the majority of event
wavetrains in spite of the rather poor signal-to-noise ratio of some seismograms. Note
that for the "near" distances (Fig.4a,b) this divergence between earthquake and
explosion seismograms is not so clear. This example is encouraging for the use of the
conventional P-S spectral ratio discriminants in the problem of local earthquake and
explosion discrimination.

The interactive automated procedure for extracting discrimination features from
seismograms was described in Section 3. It is illustrated by the flow-chart in Fig.1.
The two different methods used for measuring the P and S phase spectral
characteristics are illustrated in Fig.5. Fig.5a shows the seismograms of a local
earthquake with magnitude 1.5 and depth 10 ± 1.1 km registered at distances of 48 and
143 km. Note that the recordings have a rather good signal-to-noise ratio and that
distinct P and S+Lg phases are seen in the seismograms. (The second phase of a local
earthquake, which is really a superposition of different S-type phases and the Lg
phase, will be referred to for brevity as the S-phase.)

Fig.5b illustrates the method of extracting power features from the different
frequency bands of the P and S phase waveforms with the help of the Fast Fourier
Transform. The figure shows the FFT power spectra of the preceding seismic noise and of
the P and S waveforms. The spectra are calculated for the "far" earthquake seismogram
in the time intervals (2-44), (47.5-56) and (65.3-73.7) sec, respectively (these
intervals are marked in Fig.5c by the vertical lines). The vertical lines in Fig.5b
mark the boundaries of the frequency bands used for evaluating the discriminant
features, such as the average and maximal values of the P and S spectra estimated in
every frequency band.

Fig.5c and Fig.5d illustrate feature extraction with the help of band filtering. Traces
2-5 in Fig.5c are the output wavetrains of 5th-order band-pass Butterworth filters with
the frequency bands indicated at the start of the traces. Fig.5d shows the traces of the
S-phase current power for the frequency bands under consideration. The figure
illustrates the procedure for evaluating the averaged and maximum values of the
S-phase current power in the different bands.

Eight sets of learning feature vectors were prepared for every class with the help of
the described automated feature extraction procedure (the earthquakes are denoted as
class 1, the explosions as class 2). The sets differ in the 3 parameters used in their
notation (Fig.2): parameter p-n means that noise compensation is implemented or not;
parameter N-F means that "near" or "far" seismograms are used; parameter b-s means that
band filtering or the FFT method was employed for feature extraction. Comparison of the
classification results achieved with every set makes it possible to refine the
statistical source identification technique being developed. The essence of these
results is reflected in Tables 1-4 (see the end of this section). General conclusions
following from the analysis of the experimental results are discussed in Section 5.

Let us consider in detail the processing of the "pFb" feature vector set by following
the sequence of procedures composing the script "selfeatrs.scr" (described in
Section 3). The notation "pFb" means that the "far" seismograms were used, noise
compensation was implemented and band filtering was applied for feature extraction.

The "value traces" of the "pFb" features after their nonlinear transformations by the
functions y = log(x) and z = (1/α)(y^α - 1), α = 1/7, are shown in Fig.6. The event
numbers are indicated along the horizontal X-axis: the first 28 points correspond to
earthquakes and the remaining 25 points to explosions. Along the vertical axis of every
trace the corresponding feature values are set at the integer X-points, with linear
interpolation between the integer points. The number and label of every trace are the
same as for the corresponding feature; the feature labels (whose meanings are explained
in Appendix 2 of Section 3) are placed at the left side of the plot. This graphical
representation of the feature vector sets gives an impression of the tendencies in
feature behavior for the different classes. For this example one can say that the
features with numbers 9-18 differ appreciably in their means and variances between the
earthquake and explosion classes. This fact is encouraging for the successful
application of statistical discrimination techniques to these features.

Analysis of the pairwise feature correlation coefficients made by the program "Idstst"
(see the description of this program in Section 5) shows that the feature pairs
("9msprp0", "12msprp3") and ("11msprp2", "13msprp4") are strongly correlated (their
correlation coefficients exceed 0.75). The program "Idstst" recommended eliminating the
features "9msprp0" and "11msprp2" because their one-dimensional distributions have
smaller Mahalanobis distances between the classes than their counterparts in the above
pairs.
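
The elimination rule can be sketched in Python as follows. This is not the code of the
program used in the study. It assumes that the correlation coefficient is computed over
the pooled learning vectors of both classes and that the one-dimensional Mahalanobis
distance is the squared difference of the class means divided by the pooled variance;
the function names and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def pearson(a, b):
    """Sample correlation coefficient of two equally long value lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a)
                           * sum((y - mb) ** 2 for y in b))


def mahalanobis_1d(c1, c2):
    """Squared distance between class means in units of the pooled variance."""
    n1, n2 = len(c1), len(c2)
    pooled = ((n1 - 1) * variance(c1) + (n2 - 1) * variance(c2)) / (n1 + n2 - 2)
    return (mean(c1) - mean(c2)) ** 2 / pooled


def prune_correlated(features, n1, threshold=0.75):
    """features: label -> values; the first n1 values are class 1, the rest class 2.

    In every strongly correlated pair, drop the feature whose one-dimensional
    Mahalanobis distance between the classes is smaller.
    """
    labels, dropped = list(features), set()
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if a in dropped or b in dropped:
                continue
            if abs(pearson(features[a], features[b])) > threshold:
                da = mahalanobis_1d(features[a][:n1], features[a][n1:])
                db = mahalanobis_1d(features[b][:n1], features[b][n1:])
                dropped.add(a if da < db else b)
    return [label for label in labels if label not in dropped]


if __name__ == "__main__":
    toy = {"a": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
           "b": [1.1, 2.1, 2.9, 4.0, 5.2, 5.9],
           "c": [1.0, -1.0, 0.5, -0.5, 1.0, -1.0]}
    print(prune_correlated(toy, n1=3))
```

In the toy data "a" and "b" are strongly correlated and "b" separates the classes less
well, so "b" is the one removed.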

The remaining 16 features were processed by the program "fsel", which performs
automated selection of the most informative features providing the minimum
classification error probability (see Section 2 for the theoretical explanation and
Section 5 for a description of the program). Fig.7 illustrates the results of the
program execution for the "pFb" feature vector set. It shows the Mahalanobis distance
ρ²(p) and the classification error probability Pr(p) as functions of the number p of
informative features selected in the first p steps. The minimum of Pr(p) is reached at
step 5. The meanings of the first 5 features selected by the program "fsel" from the
total set of features composing the "pFb" learning vectors are given below (in the
order of their discrimination rank):

1) "18asprp4" = avp(S,D4)/avp(P,D4) - the ratio of the S-phase and P-phase average
powers in the highest frequency band (10-15) Hz;

2) "7srbp3" = avp(S,D3)/avp(S,D0) - the fraction of the S-phase average power belonging
to the high frequency band (6-10) Hz;

3) "10msprp1" = maxp(S,D1)/maxp(P,D1) - the ratio of the S-phase and P-phase peak
powers in the lowest frequency band (1-3) Hz;

4) "17asprp3" = avp(S,D3)/avp(P,D3) - the ratio of the S-phase and P-phase average
powers in the high frequency band (6-10) Hz;

5) "5srbp1" = avp(S,D1)/avp(S,D0) - the fraction of the S-phase average power belonging
to the lowest frequency band (1-3) Hz.

The features with rank numbers 1 and 4 characterize the excess of high frequency shear
wave energy for earthquakes in comparison with explosions. This is well explained by
the shear mechanism of earthquake sources, typical for shallow events [14-17, 6]. The
selection of the feature with rank number 3 can be explained in the same way: by the
excess of low frequency P-phase energy for explosions in comparison with earthquakes,
which is typical for the explosion mechanism. The appearance of the features with rank
numbers 2 and 5 is also connected with the higher frequency content of the S-waves for
explosions in comparison with earthquakes.

The SNDA graphic program "Cluster" makes it possible to inspect three-dimensional
scattering diagrams of the selected features. Four such diagrams for different feature
triplets composed of the selected features are shown in Fig.8. The triplets of
features with rank numbers (1,3,4) and (1,2,3) (see the list above) provide the best
visual separation of the earthquake and explosion clusters. The diagram for the (1,3,4)
triplet seems to be the most impressive, and it is shown in Fig.9 on a larger scale with
the earthquake and explosion numbers indicated. The numbers are given in accordance
with the event lists used in the study (see Appendix 2 to Section 3). Explosions 9 and
23 produced features close to the earthquake cluster, and earthquakes 24 and 26
produced features close to the explosion cluster. Thus, scattering diagrams are an
effective tool for revealing the features that provide the best cluster separation and
for detecting outlying events: earthquakes with characteristics similar to those of
explosions and vice versa.

It becomes obvious from the analysis of the Fig.8 scattering diagrams that only 4
misclassifications out of the total of 53 observations may occur if the 3-dimensional
space of the features with rank numbers (1,2,3) or (1,3,4) is cut by an appropriate
plane. It is shown below that the linear discriminant function constructed on the basis
of the 5 optimal features gives exactly 4 wrongly classified events, with the numbers
mentioned above. The Fig. scattering diagrams also indicate that the earthquake feature
points show tighter clustering than the explosion ones. This corresponds well to the
fact that the optimal features have a greater variance for the explosions than for the
earthquakes (as is seen, for example, from Fig.).

The error probability was estimated in this study by the three methods theoretically
grounded in Section 2:

1) Reclassification by the linear discriminator (program "reclld");

2) Examination (cross-validation) by the linear discriminator (program "examld");

3) Examination (cross-validation) by the quadratic discriminator (program "examqd").

We will discuss the results for all these cases.

1) The linear discrimination function (LDF) values calculated by the program "reclld"
for the earthquake and explosion learning vector sets using the 5 optimal features are
depicted in Fig.10. To make a decision, the LDF value is compared with the zero
threshold:

if LDF > 0, then the vector examined belongs to an explosion;

if LDF < 0, then the vector examined belongs to an earthquake.

One can see in Fig.10 that one earthquake and two explosions were wrongly classified by
the reclassification procedure; however, one of the correctly classified earthquakes
produces an LDF value which is very close to the zero threshold (one may say that it
lies in the "uncertainty" zone).

2) The LDF values calculated by the program "examld" for the same 5 optimal features
and the same earthquake and explosion learning sets as in the previous case are
depicted in Fig.11. Two earthquakes and two explosions were wrongly classified by this
procedure. The appearance of four mistakes in the cross-validation procedure instead of
three mistakes in the reclassification procedure is natural: it is known theoretically
that the reclassification method as a rule provides more optimistic error probability
estimates than the cross-validation method.

3) The values of the quadratic discrimination function (QDF) calculated by the program
"examqd" for the same 5 features and the same earthquake and explosion learning sets as
in the previous cases are depicted in Fig.12. Only two explosions out of the total of 53
events were wrongly classified by the quadratic discriminator in the cross-validation
procedure, instead of the four mistakes made by the LDF. Thus the total error
probability estimate is equal in this case to 3.8%. Note that the QDF also provides a
rather wide "robust classification zone", whose size is approximately equal to 20% of
the range between the minimum and maximum QDF cross-validation values. Any shift of the
classification threshold inside this zone keeps the error probability constant (equal
to 3.8%). This gives hope that moderate changes in the computational procedures for
extracting features from seismograms should not increase the discrimination error
probability. It is also interesting that the explosion QDF values form three distinct
clusters, whereas the earthquake QDF values form only a single cluster. This can be
interpreted as evidence that there were different types of explosion sources and, in
contrast, that the earthquake sources in the region have rather similar
characteristics.
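
The error count and the width of the "robust classification zone" can be illustrated by
a short Python sketch. It rests on two assumptions that the text does not state
explicitly: the QDF uses the same sign convention as the LDF (positive value means
explosion), and the zone is the interval around the zero threshold that contains no
discriminant values, measured relative to the full range of the values. The function
names and the example scores are hypothetical.

```python
def error_estimate(scores, labels):
    """Count errors for a zero threshold; label 1 = earthquake, 2 = explosion."""
    errors = sum((s > 0) != (lab == 2) for s, lab in zip(scores, labels))
    return errors, errors / len(scores)


def robust_zone(scores):
    """Threshold interval around zero in which the error count cannot change.

    Returns the interval and its width as a fraction of the full score range.
    """
    below = max(s for s in scores if s <= 0)
    above = min(s for s in scores if s > 0)
    return (below, above), (above - below) / (max(scores) - min(scores))


if __name__ == "__main__":
    # Made-up cross-validation scores: three earthquakes, then four explosions.
    scores = [-5.0, -4.0, -1.0, -2.0, 1.0, 3.0, 5.0]
    labels = [1, 1, 1, 2, 2, 2, 2]
    print(error_estimate(scores, labels))
    print(robust_zone(scores))
    print(round(100 * 2 / 53, 1), "% for 2 errors out of 53 events")
```

The error count changes only when the threshold crosses one of the discriminant
values, which is why any threshold inside the empty interval gives the same estimate.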

The numbers of the wrongly classified explosions are 9 and 23 (in accordance with the
event list used). These are the same numbers as for the outlier explosion points in the
Fig.9 scattering diagram. The explosions wrongly classified by the LDF have the same
numbers (see the Fig.10 and Fig.11 plots). The original and band filtered seismograms
of these explosions are shown in Fig.13 and Fig.14. The misclassification of explosion
9 can apparently be explained by the insufficient duration of the time interval in which
the explosion wavetrain was registered: the S-phase waveform was cut off before the
phase power began to decrease. This could result in overestimation of the average
S-phase power in all frequency bands. As a result, the event feature vector fell into
the earthquake cluster.

The misclassification of explosion 23 is almost certainly connected with its poor
signal-to-noise ratio. An incorrect attempt to "improve" the seismogram was made at the
stage of preliminary data processing: the noise amplitude spikes were clipped with the
help of an SNDA interactive tool (this is seen in the first seismogram of Fig.14). This
could lead to incorrect noise compensation (this procedure was implemented while
creating the "pFb" feature vectors).

The reclassification and cross-validation procedures for the Linear Discriminator and
the cross-validation procedure for the Quadratic Discriminator were also carried out
using the best three features ("18asprp4", "10msprp1", "17asprp3") manually selected
with the help of the scattering diagrams. The classification quality for these best
three features turned out to be the same as for the five optimal features automatically
selected by the program "fsel": the same misclassification probabilities and the same
numbers of wrongly classified events were obtained. Even the relative widths of the
"robust zones" between the clusters of points belonging to the different classes are
roughly the same in both cases. This is a convincing argument in favor of the automated
feature selection facility provided by the program "fsel".

It is quite interesting to compare the results discussed above with those produced by
the same processing applied to the seismograms recorded by the "near" stations (with
epicentral distances less than 100 km). The preprocessing of these seismograms
generated the feature vector sets which we denote as "pNb" data. Processing these data
by the feature selection and error probability estimation procedures gave the
following. Five features ("4prbp4", "10msprp1", "11msprp2", "14asprp0", "15asprp1")
were eliminated due to strong pairwise correlation (exceeding 0.75). The feature
vectors composed of the remaining 13 features were used as the input data for the
program "fsel". The course of the automated feature selection is illustrated by the
plots in Fig.15. Six optimal features were chosen:

1) "18asprp4" = avp(S,D4)/avp(P,D4) - the ratio of the S-phase and P-phase average
powers in the highest frequency band (10-15) Hz;

2) "5srbp1" = avp(S,D1)/avp(S,D0) - the fraction of the S-phase average power belonging
to the lowest frequency band (1-3) Hz;

3) "13msprp4" = maxp(S,D4)/maxp(P,D4) - the ratio of the S-phase and P-phase peak
powers in the highest frequency band (10-15) Hz;

4) "1prbp1" = avp(P,D1)/avp(P,D0) - the fraction of the P-phase average power belonging
to the lowest frequency band (1-3) Hz;

5) "8srbp4" = avp(S,D4)/avp(S,D0) - the fraction of the S-phase average power belonging
to the highest frequency band (10-15) Hz;

6) "9msprp0" = maxp(S,D0)/maxp(P,D0) - the ratio of the S-phase and P-phase peak
powers in the whole frequency band (1-15) Hz.

Note that this feature set is different from the optimal feature set for the case of
the "far" station seismograms. However, two features, "18asprp4" and "5srbp1", are
selected by the "fsel" program in both cases; moreover, the feature "18asprp4" has the
highest rank in both cases and, when used alone for discrimination, provides a
(theoretically estimated) error probability of about 20%. Although (judging by our
experiments) the epicentral distances of the stations affect the contents of the
optimal feature sets, the physical sense of all the selected features remains the same:
they reflect the relative S and P phase powers in the highest and lowest parts of the
frequency range being analyzed.

Three-dimensional scattering diagrams for different feature triplets taken from the
optimal feature set are shown in Fig.16. The clustering of the earthquake and explosion
points in the "Near" station case is slightly worse than in the "Far" case. The feature
triplet with the rank numbers (1, 2, 3) and labels ("18asprp4", "5srbp1", "13msprp4")
seems to be the most instructive one. Note that again the triplet recommended by the
program "fsel" as providing the steepest slope of the error probability curve proves to
be the most attractive from the point of view of clustering capability. Different
projections of these three-dimensional feature clusters are shown in Fig.17 (the plot
pairs (a)-(b) and (c)-(d) are projections from opposite directions but have the same
rotation angles). This figure is intended to convince a critical reader that the rather
good separation of the earthquake and explosion clusters is not an illusion due to a
specially chosen display projection. Analysis of Fig.17 assures us that no more than
five classification mistakes out of the total of 53 events can be obtained if the
three-dimensional space of this triplet is cut by a proper separating plane.

The error probability estimation carried out on the basis of the "pNb" data with the
help of the programs "reclld", "examld" and "examqd" gave the results presented in
Fig.18-20. Comparison of these results with the analogous ones for the "pFb" data
(Fig.10-12) shows that the discrimination capability of the Quadratic Discriminator in
the case of the "pNb" data is slightly worse: four mistakes are obtained for
cross-validation (7.6%) instead of two (3.8%). However, for the Linear Discriminator
the results are quantitatively the same: four mistakes are obtained for
cross-validation and three mistakes for reclassification in both cases. Nevertheless,
the "robust classification zones" between the earthquake and explosion clusters for the
"pNb" data are narrower than for the "pFb" data. These facts could be predicted from
scrutinizing the seismograms registered by the "Far" and "Near" stations (Fig.4).

The interesting result here is that all earthquakes are classified correctly, 
and the list of the four wrongly classified explosions (2, 3, 25, 8 for QDF and 
2, 8, 25, 14 for LDF) does not include the explosions (9, 23) misclassified in the 
“Far” case. This confirms the conclusion made above that the wrong classifications 
of explosions 9 and 23 are related to shortcomings in the registration of the event 
wavetrains. The “Near” station seismograms of the wrongly classified explosions 
(2, 8, 25) are plotted in Fig. 21. One can see from this figure that explosions 8 and 
25 were registered at very small distances, 24 and 28 km respectively, and for this 
reason the P and S phases overlap in the seismograms. This can result in S/P 
spectral ratios which are not typical for explosions. Explosion 2 was registered at a 
distance of 89 km and has distinct P- and S-phase waveforms. However, one can see in 
the seismogram that the S-phase waveform contains an intense high-frequency pulse 
(probably of man-made origin). This pulse undoubtedly brought additional 
high-frequency energy into the average S-wave spectrum, which leads to wrong 
discrimination feature values for this event. 

The next part of the Section is devoted to the analysis of the robustness of the 
proposed feature selection and error probability estimation techniques. Deviations of 
the classification results are studied in response to changes in the structure or 
parameters of the processing algorithms. 

Changing the value of the parameter $\alpha$ in the Box-Cox normalizing 
transformation $z = \alpha^{-1}(y^{\alpha} - 1)$ from 1/7 to 1/5 leads (for the case of 
“pFb” feature vectors) to deterioration of the classification quality. Execution of 
the programs “reclld”, “examld” and “examqd” with “pFb” data subjected to the 
transformation with $\alpha = 1/5$ gives the results shown in Fig. 22-24. Comparison of 
these figures with Fig. 10-12 reveals that the program “examqd” gave three mistakes 
instead of two (the event with number 11 was additionally wrongly classified); the 
programs “examld” and “reclld” gave the same mistakes, but the robustness of the 
classification deteriorated significantly: the “empty zones” between the LDF clusters 
corresponding to earthquakes and explosions became narrower. Nevertheless, the 
numbers of the wrongly classified events and of the events whose discrimination 
statistic values lie close to the threshold (in the “uncertainty zone”) did not change. 
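
The effect of the parameter $\alpha$ can be explored with a minimal sketch of the transformation itself. The function name `box_cox` and the sample values are illustrative assumptions, not part of the SNDA package, and the sketch assumes strictly positive inputs y.

```python
import numpy as np

def box_cox(y, alpha):
    """Return z = (y**alpha - 1) / alpha for strictly positive y."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("this sketch requires strictly positive inputs")
    return (y ** alpha - 1.0) / alpha

# Hypothetical positive feature values (not taken from the report).
y = np.array([1.0, 2.0, 128.0])
z7 = box_cox(y, 1.0 / 7.0)   # alpha = 1/7, the value used for Tables 3-4
z5 = box_cox(y, 1.0 / 5.0)   # alpha = 1/5, the value tested in Fig. 22-24
print(z7, z5)
```

The transformation is monotonic for any positive $\alpha$, so it changes the shape of the feature distributions but not the ordering of the observations.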

Eliminating the Box-Cox transformation increases the classification error 
probability up to 7.5% (see Table 1 and Table 3) and significantly worsens the 
classification robustness. Our experiments allow us to assert that the theoretically 
well grounded Box-Cox normalizing transformation proved to be helpful for the 
refinement of the statistical discrimination technique. 

The effectiveness of the noise compensation procedure, which is a part of the 
automated feature measurement algorithm, was tested by analyzing the discrimination 
capability of the "nFb" feature vectors produced with the noise compensation omitted. 
The three best features selected by the program “fsel” from the "nFb" data turned out 
to be the same as for the "pFb" data: “18asprp4”, “10msprp1”, “17asprp3”. The QDF 
values for the “nFb3” vectors composed of these 3 features are depicted in Fig. 25. 
Comparison of this figure with the analogous one for "pFb" data (Fig. 12) shows that 
the classification capability of the “nFb” data is slightly worse than in the "pFb" 
case (the program “examqd” gave three mistakes (5.7%) instead of two (3.8%) in the 
“pFb” case). Nevertheless, the three “worst” explosions, corresponding to the three 
smallest QDF values, had numbers 9, 23, 11 and are the same in both cases. The 
remarkable fact is that the explosion with number 23, wrongly classified with “pFb” 
data, gave for “nFb” data a QDF value below the threshold and so was classified 
correctly. This confirms our hypothesis that its previous misclassification was due to 
improper noise compensation caused by incorrect preprocessing of the “far” seismogram. 

The four “worst” earthquakes, corresponding to the four largest QDF values, 
had in the “nFb” case numbers 13, 24, 26, 2, while in the “pFb” case the three “worst” 
ones had numbers 24, 26, 2. The coincidence is good, and we may suspect that the 
earthquake with number 13 dropped out of the list of the “worst” earthquakes 
because of the application of the noise compensation procedure. The evident 
advantage of the noise compensation is the enlargement of the “robust classification 
zone”, which is manifested in the “pFb” case in comparison with the “pNb” case. 

As mentioned above, the combined data types ("b+s" and "s+b") were also 
prepared. The results of experiments with "pF(b+s)" data are presented in Fig. 26-28. 
The four features “9msprp0”, “11msprp2”, “13msprp4”, “14asprp0” were eliminated 
because of strong pair correlation, exceeding 0.75. The remaining 17 features were 
processed by the program "fsel", which selected the following nine optimal features: 

1) 18asprp4 = av(S,D4)/av(P,D4) - the ratio of average S-power to average P-power in 
the highest frequency band (10-15) Hz; 

2) 7srbp3 = av(S,D3)/av(S,D0) - the fraction of average S-power belonging to the 
high frequency band (6-10) Hz; 

3) 10msprp1 = max(S,D1)/max(P,D1) - the ratio of maximum S-power to maximum 
P-power in the lowest frequency band (1-3) Hz; 

4) 17asprp3 = av(S,D3)/av(P,D3) - the ratio of average S-power to average P-power in 
the high frequency band (6-10) Hz; 

5) 19psmfp = fmax(P) - the frequency where the maximum of the P-spectrum is attained; 

6) 5srbp1 = av(S,D1)/av(S,D0) - the fraction of average S-power belonging to the 
lowest frequency band (1-3) Hz; 

7) 21rmspp = Max(S)/Max(P) - the ratio of the maxima of the S and P spectra; 

8) 1prbp1 = av(P,D1)/av(P,D0) - the fraction of average P-power belonging to the 
lowest frequency band (1-3) Hz; 

9) 6ssmfp = fmax(S) - the frequency where the maximum of the S-spectrum is attained. 
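
The feature definitions above can be illustrated by a short sketch that measures a few of them from P and S phase windows. This is only an illustration under stated assumptions: a plain periodogram stands in for the band-filtering and FFT measurement procedures of the package, the band D0 is assumed to be the whole analysed range (1-15) Hz, and the function names and synthetic windows are hypothetical.

```python
import numpy as np

# Band edges in Hz. D1, D3 and D4 follow the list above; D0 is ASSUMED
# here to be the whole analysed range.
BANDS = {"D0": (1.0, 15.0), "D1": (1.0, 3.0), "D3": (6.0, 10.0), "D4": (10.0, 15.0)}

def power_spectrum(x, fs):
    """Periodogram of a phase window x sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    spec = np.abs(np.fft.rfft(x - x.mean())) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spec

def in_band(freqs, spec, name):
    """Spectral values falling inside the named band."""
    lo, hi = BANDS[name]
    return spec[(freqs >= lo) & (freqs < hi)]

def spectral_features(p_window, s_window, fs):
    fp, sp = power_spectrum(p_window, fs)
    fq, sq = power_spectrum(s_window, fs)
    av = lambda f, s, d: in_band(f, s, d).mean()   # average power in band d
    mx = lambda f, s, d: in_band(f, s, d).max()    # maximum power in band d
    lo, hi = BANDS["D0"]
    mask = (fp >= lo) & (fp < hi)
    return {
        "18asprp4": av(fq, sq, "D4") / av(fp, sp, "D4"),
        "7srbp3": av(fq, sq, "D3") / av(fq, sq, "D0"),
        "10msprp1": mx(fq, sq, "D1") / mx(fp, sp, "D1"),
        "19psmfp": fp[mask][np.argmax(sp[mask])],
    }

# Synthetic windows (not real data): P dominated by 12 Hz, S by 2 Hz.
rng = np.random.default_rng(1)
fs = 50.0
t = np.arange(500) / fs
p_window = np.sin(2 * np.pi * 12 * t) + 0.01 * rng.standard_normal(t.size)
s_window = np.sin(2 * np.pi * 2 * t) + 0.01 * rng.standard_normal(t.size)
feats = spectral_features(p_window, s_window, fs)
print(feats)
```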

Let us emphasize the following: 

1. The first four optimal features “18asprp4”, “7srbp3”, “10msprp1”, 
“17asprp3” selected from the “pF(b+s)” feature set coincide with the “best” four 
features for "pFb" data; 

2. The feature “5srbp1” was chosen at the sixth step for "pF(b+s)" data and at the 
fifth step for "pFb" data; 

3. The features “19psmfp” and “6ssmfp”, specific and quite essential for the "s" 
selection method, were included in the optimal feature subset of the “pF(b+s)” data. 

Comparison of the results of the cross-validation procedure applied to the 9 
optimal features of “pF(b+s)” data (Fig. 28) with the analogous results for the 5 
optimal features of "pFb" data (Fig. 12) shows that the classification quality is the 
same for both data types: two mistakes (3.8%) were made, with the same numbers 
(9, 23) of wrongly classified events. 
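
The two error estimates used throughout this section, reclassification and cross-validation, can be sketched as follows for a quadratic discriminator. This is a minimal illustration, not the code of “reclld”, “examld” or “examqd”: the sign convention (positive value means class 2), the function names and the synthetic data are assumptions.

```python
import numpy as np

def fit_gauss(X):
    """Mean vector and sample covariance matrix of the rows of X."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def qdf(x, m1, C1, m2, C2):
    """Gaussian log-likelihood ratio of class 2 against class 1."""
    def loglik(m, C):
        d = x - m
        return -0.5 * (d @ np.linalg.solve(C, d) + np.linalg.slogdet(C)[1])
    return loglik(m2, C2) - loglik(m1, C1)

def reclassification_errors(X1, X2):
    """Classify the learning vectors with the rule trained on all of them."""
    p1, p2 = fit_gauss(X1), fit_gauss(X2)
    wrong1 = sum(qdf(x, *p1, *p2) > 0 for x in X1)
    wrong2 = sum(qdf(x, *p1, *p2) <= 0 for x in X2)
    return int(wrong1 + wrong2)

def cross_validation_errors(X1, X2):
    """Leave one vector out, retrain, then classify the held-out vector."""
    p1_all, p2_all = fit_gauss(X1), fit_gauss(X2)
    wrong = 0
    for i in range(len(X1)):
        p1 = fit_gauss(np.delete(X1, i, axis=0))
        wrong += qdf(X1[i], *p1, *p2_all) > 0
    for i in range(len(X2)):
        p2 = fit_gauss(np.delete(X2, i, axis=0))
        wrong += qdf(X2[i], *p1_all, *p2) <= 0
    return int(wrong)

# Synthetic stand-ins for 28 earthquakes and 25 explosions (not real data).
rng = np.random.default_rng(0)
X1 = rng.normal(0.0, 1.0, size=(28, 3))
X2 = rng.normal(1.5, 1.0, size=(25, 3))
n = len(X1) + len(X2)
print("reclassification: %.1f%%" % (100 * reclassification_errors(X1, X2) / n))
print("cross-validation: %.1f%%" % (100 * cross_validation_errors(X1, X2) / n))
```

With 53 events each mistake contributes about 1.9% to the estimate, which is why the percentages quoted in the text move in steps of that size.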

The complete information about the classification quality achieved using different 
variants of the learning data set with application of the logarithmic feature 
transformation $y = \ln(x)$ and the additional Box-Cox transformation 
$z = \alpha^{-1}(y^{\alpha} - 1)$ with $\alpha = 1/7$ is collected in Tables 1-4. The 
analysis of these tables allows us to draw the following conclusions: 

1. Data from "far" stations provide, as a rule, a smaller error probability than 
data from "near" stations. 

2. Employment of the Box-Cox normalizing transformation tangibly decreases 
the identification error probability if the decision is made using statistical 
identification rules. 

3. "Pure" data obtained after the noise compensation procedure usually 
demonstrate a smaller error probability than "noise" data not subjected to this 
procedure. 

4. The band filtering method for measurement of phase averaged and maximal 
powers in different frequency bands turns out to be more effective in the sense of 
error probability than the Fast Fourier Transform method. This is an experimental 
fact and we cannot explain it at the moment. Possibly some refinement of the FFT 
method is needed. 

5. The "b+s" learning vectors composed of the features measured by the 
band filtering method with the addition of several specific features provided by the 
FFT method (see Section 3) gave in our experiments the same discrimination mistakes 
as the “b”-type learning vectors. Nevertheless, we hope that this augmented assortment 
of discrimination features will be the most effective in further experiments. 


Table 1 

Error probability cross-validation estimates (%) 
for the Linear Discriminator with feature transformation 
by the function y = log(x) 

Features          Unselected   Selected   Unselected   Selected 

Type of data      pFb16        pFb8       pFs15        pFs6 
Error probab. %   11.5         7.5        7.5          7.5 

Type of data      nFb12        nFb8       nFs14        nFs5 
Error probab. %   11.0         11.0       9.4          9.4 

Type of data      pNb11        pNb3       pNs13        pNs4 
Error probab. %   13.5         11.0       11.3         9.4 

Type of data      nNb12        nNb6       nNs12        nNs5 
Error probab. %   9.4          9.4        9.4          7.5 


Table 2 

Numbers of events incorrectly classified 
by the Linear Discriminator with transformation 
of features by the function y = log(x) 

Features          Unselected            Selected          Unselected        Selected 

Type of data      pFb16                 pFb8              pFs15             pFs6 
Earthquakes       4, 14, 15             4, 15             5, 24             15, 21, 24 
Explosions        9, 11, 15             9, 11             9, 15             9 

Type of data      nFb12                 nFb8              nFs14             nFs5 
Earthquakes       4, 14, 15, 21, 24     4, 14, 15, 24     14, 15, 21, 24    14, 15, 21, 24 
Explosions        9                     6, 9              9                 9 

Type of data      pNb11                 pNb3              pNs13             pNs4 
Earthquakes       20, 25                20, 25            20, 22, 25        20, 22, 25 
Explosions        2, 3, 6, 8, 14, 20    2, 3, 14, 20      2, 20, 25         2, 20 

Type of data      nNb12                 nNb6              nNs12             nNs5 
Earthquakes       20, 25                20, 25            20, 25            20, 22, 25 
Explosions        2, 14, 20             2, 14, 20         2, 20             2, 14 


Table 3 

Error probability cross-validation estimates (%) for Linear and Quadratic 
Discriminators with feature transformation by the functions 
$y = \log(x)$, $z = 7(y^{1/7} - 1)$ 

Features          Unselected      Selected        Unselected      Selected 

Type of data      pFb16           pFb5            pFs15           pFs7 
Discriminator     LD     QD       LD     QD       LD     QD       LD     QD 
Error probab. %   13.8   9.6      7.5    3.8      20.7   5.8      11.3   5.8 

Type of data      nFb14           nFb3            nFs             nFs 
Discriminator     LD     QD       LD     QD       LD     QD       LD     QD 
Error probab. %   9.4    7.5      7.5    5.5 

Type of data      pNb13           pNb6            pNs15           pNs8 
Discriminator     LD     QD       LD     QD       LD     QD       LD     QD 
Error probab. %   9.4    7.5      7.5    7.5      11.3   9.4      7.5    9.4 

Type of data      nNb             nNb             nNs             nNs 
Discriminator     LD     QD       LD     QD       LD     QD       LD     QD 
Error probab. % 

Type of data      pF(b+s)18       pF(b+s)9        pF(s+b)18       pF(s+b)9 
Discriminator     LD     QD       LD     QD       LD     QD       LD     QD 
Error probab. %   13.2   5.7      9.4    3.8      17.0   7.5      5.7    9.4 


Table 4 

Numbers of events incorrectly classified by Linear and Quadratic Discriminators 
with transformation of features by the functions $y = \log(x)$, $z = 7(y^{1/7} - 1)$ 

Features     Type of data   Discriminator   Earthquakes      Explosions 

Unselected   pFb16          LD              24               15, 14, 6, 23, 25, 9 
                            QD              2, 9             9, 23, 11 
Selected     pFb5           LD              24, 26           9, 23 
                            QD              -                9, 23 
Unselected   pFs15          LD              24, 13           15, 14, 9, 25, 23, 28, 13, 20 
                            QD              13               9, 11 
Selected     pFs7           LD 
                            QD 

Unselected   nFb14          LD              24, 13           3, 9, 23 
                            QD              2, 13            23, 9 
Selected     nFb11          LD              24, 13           3, 9 
                            QD              13, 24           9 
Unselected   nFs            LD 
                            QD 
Selected     nFs            LD 
                            QD 

Unselected   pNb13          LD                               2, 8, 25, 20, 14 
                            QD              6, 8             25, 2 
Selected     pNb6           LD                               2, 8, 25, 14 
                            QD                               2, 3, 25, 8 
Unselected   pNs15          LD              14               20, 2, 8, 14, 25 
                            QD              20, 25, 22       2, 25 
Selected     pNs8           LD              -                14, 8, 2, 25 
                            QD              22, 20, 25, 3    2 

Unselected   nNb            LD 
                            QD 
Selected     nNb            LD 
                            QD 
Unselected   nNs            LD 
                            QD 
Selected     nNs            LD 
                            QD 

Unselected   pF(b+s)18      LD                               15, 14, 23, 25, 9 
                            QD              2, 9             9 
Selected     pF(b+s)9       LD              24               9, 14, 23, 20 
                            QD                               9, 23 
Unselected   pF(s+b)18      LD              24               14, 15, 25, 9, 23, 18 
                            QD              2, 13            9, 11 
Selected     pF(s+b)9       LD              24               9, 23 
                            QD              13, 2, 24        18, 9 

1.5. Conclusions and recommendations 


1. A flexible automated technique for seismogram processing aimed at 
discriminating small earthquakes and explosions was developed. The technique was 
designed in the framework of the Seismic Network Data Analysis System (SNDA), a 
problem-oriented programming shell developed at SYNAPSE Science Center 
/Moscow IRIS Data Analysis Center. The program package for statistical 
discrimination with selection of the most informative features and estimation of the 
error probability was built into the SNDA. This technique is intended to be a 
component of an automated data analysis system for CTBT monitoring. 

The preliminary testing of the technique was made using teleseismic P-wave 
+ P-coda recordings of 32 nuclear explosions at the Semipalatinsk test site and 35 
earthquakes that occurred in Eastern Kazakhstan. The data were the same as 
used in [6]. The results of this testing experiment were described in [20]. A thorough 
investigation of the discrimination capabilities of the statistical technique was 
undertaken on the basis of local event seismograms from 28 small earthquakes and 25 
industrial chemical explosions registered by the Israel local seismic network. Spectral 
characteristics of the data are described in [19]. 

2. The proposed technique for seismogram discrimination is founded on the 
statistical approach and provides measurement of various spectral features of 
seismogram wave phases, selection of the set of those features which are optimal for 
earthquake-explosion discrimination in a given region, and the decision whether the 
tested event seismogram is to be attributed to an explosion or an earthquake. Our 
experiments showed that the conventional P-S spectral ratio discriminants were in 
every case included in the optimal feature set by the automated feature selection 
procedure along with other (regionally dependent) features. 

3. A powerful graphic tool was designed in the SNDA System for 
3-dimensional visualization of the clustering of feature vectors corresponding to 
earthquakes and explosions. Interactive manual selection, with the help of this tool, 
of the optimal feature triplets demonstrating the most distinct clustering confirmed 
the results of the automated feature selection. We hope that the serious problem of 
“transportation” of discriminants efficient in one region to another region can be 
facilitated by the proposed automated and interactive feature selection procedures. 

4. An important component of the proposed technique is the precise estimation 
of discrimination error probabilities. Comparison of different estimation methods 
showed that the cross-validation procedure should be used as a consistent 
estimator of the misclassification probability intrinsic to a given region. It is 
especially helpful when the numbers of learning earthquake and explosion observations 
are not large. 

5. Transformations of features by nonlinear functions such as log(x) and the 
Box-Cox normalizing function $z = \alpha^{-1}(y^{\alpha} - 1)$ considerably decrease the 
discrimination error probability when the conventional statistical linear and quadratic 
discriminators (optimal for Gaussian feature distributions) are used for decision 
making. 

6. Implementation of the noise suppression procedure provides, as a rule, an 
increase in discrimination quality and allows seismograms with a small signal-to-noise 
ratio to be involved in the discrimination processing. 

7. The proposed technique for seismogram discrimination was thoroughly 
tested by applying it to discrimination between earthquakes and chemical explosions 
recorded by the Israel local seismic network. The pre-selection of event seismograms 
registered by the network, implementation of the noise refinement and feature 
selection procedures, nonlinear feature transformation and employment of the 
statistical quadratic discriminator gave for these data an average misclassification 
probability (estimated with the help of the cross-validation procedure) equal to 3.8% 
(only 2 events, both explosions, were misclassified out of 53 events). The 
classification mistakes can apparently be explained by the insufficient quality of the 
seismogram recording and a small signal-to-noise ratio. 

8. The capability of local event source discrimination tends to improve with 
increasing distance of the recording station from the event source. In our 
experiments the misclassification probability of events recorded at distances of about 
60 km is equal to 7.8%, while for the same events recorded at distances of about 140 
km it is equal to 3.8%. Besides, the robust classification zone for seismograms 
recorded at “far” distances is wider than for “near” station seismograms. 

9. Possible improvements of the discrimination technique can be made in the 
following directions: 

Extension of the set of relevant features automatically measured from 
seismograms (by including, for example, features characterizing various functional 
distances between estimates of the power spectra for different phases). 

Development of an advanced feature selection procedure. This can be done in the 
following ways: 

a) by using theoretical error probability formulae for nonlinear 
classification rules; such formulae are derived, for example, for the quadratic 
discriminator [22,23], and it is natural to employ them at the stage of feature selection 
if the final decision is to be made by the quadratic discriminator. 

b) by employing more sophisticated (but more computer resource 
consuming) recurrent procedures for selection of a feature set providing a global 
minimum of the discrimination error probability. For example, an exhaustive search 
could be employed at the initial steps of the recurrent procedure, when the number of 
features being selected is not large. 

c) by adopting the cross-validation error probability estimate for use at every 
step of the recurrent feature selection procedure instead of any theoretical formula. 
Such a complex procedure is the most robust and flexible with respect to changes of 
classification rules and can be used for optimization of the total discrimination scheme. 

Implementation of more sophisticated classification rules (providing separation 
of the feature space by more complex surfaces than the hyperplane and hypersphere 
used in our experiments). Such a rule can be designed on the basis of a trained 
neural network, but application of this technique to estimation of the 
misclassification probability by the cross-validation method or to feature selection is 
extremely time consuming. 
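
Direction c) above, using the cross-validation estimate at every step of the recurrent selection, can be sketched as a greedy forward search. This is only an illustration under assumptions: a linear discriminator with pooled covariance stands in for whichever rule is finally used, the function names are hypothetical, and the data are synthetic.

```python
import numpy as np

def loo_errors(X1, X2):
    """Leave-one-out error count of a linear discriminator (pooled covariance)."""
    def is_class2(x, A, B):
        mA, mB = A.mean(axis=0), B.mean(axis=0)
        dA, dB = A - mA, B - mB
        C = (dA.T @ dA + dB.T @ dB) / (len(A) + len(B) - 2)
        w = np.linalg.solve(C, mB - mA)
        return w @ (x - 0.5 * (mA + mB)) > 0
    wrong = 0
    for i in range(len(X1)):
        wrong += bool(is_class2(X1[i], np.delete(X1, i, axis=0), X2))
    for i in range(len(X2)):
        wrong += not is_class2(X2[i], X1, np.delete(X2, i, axis=0))
    return int(wrong)

def forward_select(X1, X2, max_features):
    """At each step add the feature giving the fewest leave-one-out errors."""
    chosen, history = [], []
    remaining = list(range(X1.shape[1]))
    while remaining and len(chosen) < max_features:
        scores = [(loo_errors(X1[:, chosen + [j]], X2[:, chosen + [j]]), j)
                  for j in remaining]
        err, best = min(scores)
        chosen.append(best)
        remaining.remove(best)
        history.append(err)
    return chosen, history

# Synthetic data (not real features): only column 2 separates the classes.
rng = np.random.default_rng(0)
X1 = rng.normal(0.0, 1.0, size=(30, 4))
X2 = rng.normal(0.0, 1.0, size=(30, 4))
X2[:, 2] += 10.0
chosen, history = forward_select(X1, X2, max_features=2)
print(chosen, history)
```

Each step retrains the rule once per held-out vector and per candidate feature, which is why the text calls such a procedure computer resource consuming.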























Fig. 1. Data preparation for classification. 

Fig. 3. Location of Israel local network stations and seismic events 
used in source discrimination study. 
a,b. Location of stations that recorded earthquakes and explosions. c. Location of 
earthquake and explosion sources. d. Magnitudes and focal depths of events. 

Fig. 4. Set of event seismograms used in discrimination study. 
Two seismograms from stations at different distances were used for each event. 

Fig. 6. “Value traces” of pFb features. 

Fig. 8. Four diagrams for different feature triplets. 

Fig. 10. Linear discrimination function calculated by the 
program “reclld” for the earthquake and explosion 
learning vector sets using the 5 optimal features. 

Fig. 11. Linear discrimination function calculated by the 
program “examld” for the earthquake and explosion 
learning vector sets using the 5 optimal features. 

Fig. 12. Quadratic discrimination function 
calculated by the program “examqd” for the earthquake 
and explosion learning vector sets using the 5 optimal features. 

Fig. 17. Different projections of three-dimensional space. 

Fig. 18. Results of the error probability estimation 
accomplished on the basis of “pNb” data 
with the help of the program "reclld". 

Fig. 19. Results of the error probability estimation 
accomplished on the basis of “pNb” data 
with the help of the program "examld". 

Fig. 20. Results of the error probability estimation 
accomplished on the basis of “pNb” data 
with the help of the program "examqd". 
(Panel axes: number of observations rearranged by QD, class 1 and class 2; 
symbols: + class 1, o class 2.) 

Fig. 22. Results of execution of the program “reclld” 
with “pFb” data transformed by the function $z = \alpha^{-1}(y^{\alpha} - 1)$, $\alpha = 1/5$. 

Fig. 24. Results of execution of the program “examqd” 
with “pFb” data transformed by the function $z = \alpha^{-1}(y^{\alpha} - 1)$, $\alpha = 1/5$. 

Fig. 25. Results of QDF cross-validation 
for “nFb” data with 3 optimal features. 

Fig. 26. Results of LDF reclassification 
for combined “pF(b+s)” data with 9 optimal features. 

Fig. 27. Results of LDF cross-validation 
for combined “pF(b+s)” data with 9 optimal features. 

Fig. 28. Results of QDF cross-validation 
for combined “pF(b+s)” data with 9 optimal features. 

1.6. Description of program package 
for statistical identification of seismic source type 

1.6.1 Program "LD" 

Input data for learning and classification in the SNDA stack 

The program "LD" provides auxiliary transformations of feature vectors that 
make it easier for the subsequent procedures of the statistical classification package 
to handle learning vectors and vectors with unknown class attributes. It reads the 
vector data from input files and writes them to the System SNDA Stack and to some 
auxiliary files in order to deliver them to the next procedures. As a result, every 
channel of the System SNDA Stack contains the data of one feature. So, if the number 
of features is equal to p, p channels are produced in the Stack. The feature values 
corresponding to the class 1 and class 2 learning vectors and to the vectors to be 
classified are arranged in the channels sequentially. So, if the number of learning 
vectors from class 1 is equal to $n_1$, from class 2 to $n_2$, and the number of 
unattributed vectors (from unknown classes) is equal to $n_0$, then the number of 
points in every channel is $(n_1 + n_2 + n_0)$. In the reclassification mode the 
learning vectors also serve as the vectors to be classified, so they are written to the 
Stack twice. In this mode $n_0 = n_1 + n_2$ and the total number of points in a stack 
channel is equal to $2(n_1 + n_2)$. To provide the reclassification mode the program 
“ld” reads the vectors for classification from two distinct input files. 
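
The channel layout described above can be sketched with plain arrays. The function `build_stack` is a hypothetical stand-in, not an SNDA routine; it only reproduces the ordering of points inside each channel.

```python
import numpy as np

def build_stack(class1, class2, unknown=None):
    """Return p channels, each holding class 1, class 2, then the vectors to classify.

    With unknown=None the reclassification mode is imitated: the learning
    vectors are appended a second time as the vectors to be classified.
    """
    class1 = np.asarray(class1, dtype=float)
    class2 = np.asarray(class2, dtype=float)
    if unknown is None:
        unknown = np.vstack([class1, class2])
    rows = np.vstack([class1, class2, np.asarray(unknown, dtype=float)])
    return rows.T   # shape (p, n1 + n2 + n0)

# 28 class 1 and 25 class 2 vectors with 18 features, reclassification mode.
stack = build_stack(np.zeros((28, 18)), np.ones((25, 18)))
print(stack.shape)  # (18, 106)
```

Here 106 = 2(28 + 25), in agreement with the channel length given for the reclassification mode.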

The allocation of the data in the SNDA stack allows the user to display them on 
the screen with the help of the SNDA graphic facilities in the form of “channel 
traces”. This facilitates the manual selection of the most informative features and 
the detection of data outliers. The parameters of the learning data and of the data for 
classification are saved to the file g£c/ot and are used by all other programs of the 
statistical classification package (participating in the given identification session). 

Program input parameters 

All input parameters of the program are to be contained in the file “ld.inp”. 
An example of this input file is given below. 

****INPUT FILE FOR PROGRAM "LD" (2 CL. VERSION) : standard**** 

NUMBER OF CLASSES (maximum 2 in this version) 

2 

NUMBER OF FEATURES (maximum 25) 

18 

NUMBERS OF LEARNING VECTORS (maximum 100) 

28 25 

NAME OF FILE FOR 1 CLASS LEARNING VECTORS 
data/troitsky/iearpFp.dat 

NAME OF FILE FOR 2 CLASS LEARNING VECTORS 
data/troitsky/iexplpFp.dat 
NAME OF FILE FOR FEATURE LABELS 
data/troitsky/lblpb.18 

NAME OF FILE 1 FOR VECTORS FOR CLASSIFICATION 
data/troitsky/iearpFp.dat 

NAME OF FILE 2 FOR VECTORS FOR CLASSIFICATION 
data/troitsky/iexplpFp.dat 

Explanation of input file parameters 

NUMBER OF CLASSES (2 in this version) 

The parameter defines the number of classes assumed in the given 
classification session. In this simple version of the package the number of classes is 
restricted to two, so this setting does not need to be adjusted. 

NUMBER OF FEATURES (maximum 25) 

The parameter defines the number of features in the learning and unattributed 
vectors. This value is delivered to the other package programs participating in the 
given classification session. 

NUMBERS OF LEARNING VECTORS (maximum 100 in this version) 

These two parameters define the numbers of learning vectors from class 1 and 
class 2. 

NAME OF FILE FOR LEARNING VECTORS OF CLASS 1 (input file) 

This is the name of the file containing the learning vectors of class 1. The file must 
have the form of an ASCII matrix; every column of the matrix is composed of the 
observations of one feature. The number of columns is equal to the number of 
features p. The number of rows is equal to the number of observations $n_1$ for class 1. 

NAME OF FILE FOR LEARNING VECTORS OF CLASS 2 (input file) 

This is the name of the file containing the learning vectors of class 2. The file must 
have the form of an ASCII matrix; every column of the matrix is composed of the 
observations of one feature. The number of columns is equal to the number of 
features p. The number of rows is equal to the number of observations $n_2$ for class 2. 

NAME OF FILE FOR FEATURE LABELS (input file) 

This is the name of the file containing the feature labels. The labels must consist of 
no more than 8 ASCII symbols and be written as a column. 

NAME OF FILE 1 FOR VECTORS FOR CLASSIFICATION (input file) 

This is the name of the file containing the vectors for classification. In the plain 
classification mode this file contains unattributed vectors belonging to unknown 
classes. In the reclassification mode this file contains the learning vectors from class 1. 
The file must have the form of an ASCII matrix; every column of the matrix is 
composed of the observations of one feature. The number of columns is equal to the 
number of features p. The number of rows is equal to the number of observations 
$n_0$ of unattributed vectors to be classified. 

NAME OF FILE 2 FOR VECTORS FOR CLASSIFICATION (input file) 

This parameter is valid only for the reclassification mode. In this case the directory 
and name of the file containing the learning vectors from class 2 must be assigned 
here. The format of this file is the same as the one described above. For the mode of 
classification of new unattributed vectors the name of the file has to contain no less 
than 5 blanks. 


1.6.2. Program "LDSTST" 

Evaluation of standard learning data statistics 

The program "ldstst" provides the user with means for manual selection of the 
identification features and calculates the general statistics of the learning data used 
in the subsequent programs of the statistical classification package. It accomplishes 
the following actions: 

1) Reading of learning vectors and vectors for classification from the System 
SNDA Stack. The vectors read contain only the subset of features selected for further 
processing and specified by the program input parameter. The features can be 
selected, in particular, by displaying with the SNDA graphic tool the “feature traces” 
(i.e. the graphical representation of the feature value sequences for the observations 
of class 1 and class 2 and the observations to be classified) or the 3-dimensional 
feature scattering diagrams. 

2) Forming of output files with class 1 and class 2 vectors composed of the 
specified features. The files are used by the SNDA program CLUSTER, which provides 
the 3-dimensional scattering diagrams of the features. 

3) Calculating of the following statistics of learning data: 
a) the mean vectors m(k) for every class k = 1,2; 
c) the sample covariance (p x p) matrix C (where p is the number of features 
selected); 

d) the (p x p) matrix R whose elements r(jl) are the estimates of the correlation 
coefficients between the features with numbers j and l; the maximum absolute value 
of the correlation coefficients r(jl) over all j,l = 1,...,p and the numbers $j_{max}$, 
$l_{max}$ of the features for which this value is attained; 

e) the Mahalanobis distances d(k), k = 1,...,p, between the one-dimensional feature 
distributions corresponding to classes 1 and 2, the minimum value of d(k) and the 
number $k_{min}$ of the feature for which this value is attained. 

The information obtained in items d) and e) is displayed on the screen and 
allows one to choose the proper feature set for further processing by excluding one 
feature from each highly correlated pair and/or the features with small 
Mahalanobis distances. 

4) Plotting of the 2 dimensional scattering diagrams of the features with the 

help of standard UNIX routine "plotxy". 

5) Forming of auxiliary output data files for the subsequent package programs. 

The mean vectors m(k) are calculated by the equation 

$$m(k) = \frac{1}{n_k}\bigl(X_k(1) + \dots + X_k(n_k)\bigr), \quad k = 1,2,$$ 

where $X_k(i)$ is the i-th feature vector from class k. 

The elements r(jl) of matrix R are defined by the formula 

$$r(jl) = \frac{c(jl)}{\bigl(c(jj)\,c(ll)\bigr)^{1/2}}, \quad j,l = 1,2,\dots,p,$$ 

where c(jl) are the elements of the unnormalized covariance matrix C: 

$$C = \sum_{k=1}^{2} \Bigl\{ \sum_{i=1}^{n_k} \bigl(X_k(i) - m(k)\bigr)\bigl(X_k(i) - m(k)\bigr)^T \Bigr\},$$ 

and T is the sign of vector transposition. 
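
The three formulas above translate directly into a few lines of array code. The sketch below is an illustration, not the “ldstst” source; the function name and the synthetic learning vectors are assumptions.

```python
import numpy as np

def learning_statistics(X1, X2):
    """Class means m(k), unnormalized covariance C and correlation matrix R."""
    means = [X1.mean(axis=0), X2.mean(axis=0)]
    # C sums the within-class scatter of both classes, as in the formula for C.
    C = sum((X - m).T @ (X - m) for X, m in zip((X1, X2), means))
    s = np.sqrt(np.diag(C))
    R = C / np.outer(s, s)          # r(jl) = c(jl) / sqrt(c(jj) c(ll))
    return means, C, R

# Synthetic learning vectors (not real data): feature 1 is twice feature 0.
rng = np.random.default_rng(0)
a = rng.normal(size=(10, 1))
b = rng.normal(size=(12, 1)) + 3.0
X1 = np.hstack([a, 2 * a, rng.normal(size=(10, 1))])
X2 = np.hstack([b, 2 * b, rng.normal(size=(12, 1))])
means, C, R = learning_statistics(X1, X2)
print(np.round(R, 2))
```

Because each class is centred on its own mean before the products are summed, a shift between the class "centers" does not inflate the correlation estimates.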

In the geometrical sense m(k) are the "centers" of the classes in the p-dimensional 
feature space. A large distance between the "centers" of the classes implies low error 
probabilities of discrimination. The analysis of matrix R allows one to reveal groups 
of uncorrelated features and/or pairs of strongly correlated features. It is recommended 
to eliminate one feature of a pair if |r(jl)| > 0.75 for this pair. It is also recommended 
to eliminate the features with small one-dimensional Mahalanobis distances 
between the distributions of classes 1 and 2. The features are selected by 
assigning the proper Stack channel numbers in the program input file. 
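
The two recommendations above can be combined in a small sketch. Two points are assumptions rather than statements of the report: the one-dimensional distance is taken as the squared difference of class means divided by the pooled variance, and within a highly correlated pair the feature with the smaller distance is the one dropped. The function names are hypothetical.

```python
import numpy as np

def one_dim_distances(X1, X2):
    """ASSUMED form of d(k): squared mean difference over pooled variance."""
    n1, n2 = len(X1), len(X2)
    pooled = ((n1 - 1) * X1.var(axis=0, ddof=1)
              + (n2 - 1) * X2.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    return (X1.mean(axis=0) - X2.mean(axis=0)) ** 2 / pooled

def prune_correlated(R, d, threshold=0.75):
    """Keep features in order of decreasing d(k), skipping any feature whose
    |r(jl)| with an already kept feature exceeds the threshold."""
    kept = []
    for j in np.argsort(-np.asarray(d, dtype=float)):
        if all(abs(R[j, k]) <= threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)

# Hypothetical correlation matrix and distances for three features.
R = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
d = [3.0, 1.0, 2.0]
print(prune_correlated(R, d))  # [0, 2]
```

Features 0 and 1 form the only pair above the 0.75 threshold, and feature 1 has the smaller distance, so it is the one removed.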
Scatterplots for different feature pairs can be displayed on the screen with the 
help of the standard UNIX routine "plotxy". The plotting is organized in the following 
manner: one of the features is chosen as the "basic" one, and p-1 plots with 
scattering diagrams of this feature against the remaining ones are displayed on the 
screen. On every plot the points with X-Y coordinates equal to the values of the given 
feature pair for the class 1 and class 2 observations are drawn. The points 
corresponding to different classes are plotted with different symbols. As a result, 
clusters of points attributed to different classes can be seen on the diagrams. By 
analyzing the scatterplots the user may select the most separated pairs of features. 
Besides that, it is possible to estimate the proximity of the two-dimensional feature 
distributions to the Gaussian one. 

Program input parameters 

All input parameters of the program are contained in the file “ldstst.inp”.
An example of this input file is given below.

****INPUT FILE FOR PROGRAM "LDSTST": standard****

STACK CHANNELS (FEATURES) TO BE USED: 8; (1-8 10 12-18); all
(18 7 10 17 5)

NAME OF OUTPUT FILE FOR VECTORS FOR CLASSIFICATION
data/troitsky/newvect.dat

NAME OF OUTPUT FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF LEARNING VECTORS
data/troitsky/resku.dat

NAME OF GRAPHIC CONTROL FILE
plot/troitsky/gr.gr

NUMBER OF REFERENCE FEATURE FOR SCATTERING DIAGRAMS
1

NAME OF OUTPUT FILE WITH CLASS 1 VECTORS (FOR CLUSTER IMAGING)
data/troitsky/clast1.dat

NAME OF OUTPUT FILE WITH CLASS 2 VECTORS (FOR CLUSTER IMAGING)
data/troitsky/clast2.dat
Explanation of program input parameters

STACK CHANNELS (FEATURES) BEING USED (input information)

This string defines the numbers of the features selected for the subsequent analysis.
If all features are kept, "all" must be written here. Otherwise the numbers of the
features (channels) to be kept are given in the CHANNELS notation used for
the SNDA Stack commands.

NAME OF OUTPUT FILE FOR VECTORS FOR CLASSIFICATION 

This is the name of the file with unattributed vectors (to be classified) composed
of the selected features.

NAME OF OUTPUT FILE FOR MEAN VECTORS AND COVARIANCE 
MATRIX OF LEARNING VECTORS 

This is the name of the file containing the mean vectors m(k), k = 1,2, for every class;
the normalized correlation matrix R; the unnormalized covariance matrix C for the
selected set of features; and some service information needed for execution of the
following programs.

NAME OF GRAPHIC CONTROL FILE 

This is the name of the control file for the "plotxy" UNIX routine containing
commands for displaying feature scattering diagrams.

NUMBER OF THE REFERENCE FEATURE FOR SCATTERING DIAGRAM 

This feature is used to build the p-1 scattering diagrams. On every diagram
the values of this feature for the different observations are plotted along the X axis,
and the values of one of the remaining features along the Y axis.

NAME OF OUTPUT FILE WITH CLASS 1 VECTORS (FOR CLUSTER 
IMAGING) 

This is the name of the file containing the selected feature vectors of class 1. The
file is used by the other programs of the package, in particular the SNDA program
“CLUSTER”, which provides three-dimensional plotting of feature scattering diagrams.

NAME OF OUTPUT FILE WITH CLASS 2 VECTORS (FOR CLUSTER
IMAGING)

This is the name of the file containing the selected feature vectors of class 2. The
file is used by the other programs of the package, in particular the SNDA program
“CLUSTER”, which provides three-dimensional plotting of feature scattering diagrams.
1.6.3. Program "FSEL"

Automatic selection of informative features
providing the minimum estimate of the classification error probability

The program "fsel" accomplishes the following operations:

1) automatic stepwise selection of the most informative features, providing the
least error probabilities for classification based on the given set of learning data;

2) calculation of the function R(k), k = 1,...,p, which is the Mahalanobis
distance as a function of the number k of features used for classification;

3) calculation of the function P(k), k = 1,...,p, which is the theoretical total
probability of classification errors as a function of the number k of features used
for classification;

4) calculation of the value k_0 at which the function P(k) attains its minimum
(k_0 = argmin P(k));

5) plotting of the calculated functions R(k) and P(k) on the screen using the
standard UNIX routine "plotxy", with the numbers and labels of the features chosen
at every selection step displayed on the plot.

The Mahalanobis distance between two k-dimensional distributions of
feature vectors (corresponding to classes 1 and 2) is defined by the quadratic form

$$ R(k) = \bigl(m(k,2) - m(k,1)\bigr)^{T} S^{-1}(k) \bigl(m(k,2) - m(k,1)\bigr), $$

where m(k,1), m(k,2) are the sample mean vectors of the learning data for classes 1
and 2, calculated for the k-feature vectors; S(k) = C(k)/(n_1+n_2) is the sample
covariance matrix calculated for the k-feature vectors using the total learning data
of both classes; and T is the sign of vector transposition.

At the first step of the selection procedure, p values of the functional R(1) are
calculated, one for every feature. The maximum of these p values is attained at some
feature j(1), which is thus selected. At the second step, p-1 values of the functional
R(2) are calculated for pairs of features: the first member of each pair is always the
previously selected feature j(1), and the second member is one of the
remaining features. The second feature selected is the one that gives the maximum of
these R(2) values. At the k-th step of the procedure, p-k+1 values of the functional
R(k) are calculated for k-feature vectors. The first k-1 components of these
vectors are the features selected at the previous steps, and the k-th component
is one of the remaining features. At each step k = 1,...,p of the
selection procedure, the number j(k) and the label of the selected feature are stored
and the theoretical value of the total error probability P(k) is calculated. The value
P(k) (for k > 1) is calculated by the formula:

$$ P(k) = \tfrac{1}{2}\Bigl[1 - T_k\bigl(R(k)/\sigma(k)\bigr) + T_k\bigl(-R(k)/\sigma(k)\bigr)\Bigr], $$

where

$$ \sigma^2(k) = \frac{t+1}{t}\,\bigl[r_1 + r_2 + R(k)\bigr]; \quad t = \frac{r_1 + r_2}{r_1 r_2} - 1; \quad r_1 = k/n_1; \quad r_2 = k/n_2; $$

$$ T_k(z) = F(z) + \frac{1}{k-1}\bigl(a_1 - a_2 H_1(z) + a_3 H_2(z) - a_4 H_3(z)\bigr) f(z), $$

F(z) is the cumulative distribution function of the standard Gaussian distribution; f(z)
is the density of this distribution; H_i(z) is the Hermite polynomial of order i,
i = 1,2,3; a_j, j = 1,...,4, are coefficients depending on k, n_1, n_2 and R(k) [ ].

This formula was derived via an asymptotic expansion of the distribution
function of the conventional linear discriminator, under the assumption that the
number of features p and the numbers of learning vectors n_1, n_2 of both classes
increase simultaneously at the same rate.

The program "fsel" then determines the number k_0 of the selection step at which the
minimum of the function P(k), k = 1,...,p, is attained: k_0 = argmin P(k). Thus the
optimal set of features with the numbers j(1), j(2),..., j(k_0) is determined. These
features provide the minimal total error probability.
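
The stepwise part of the procedure can be sketched in Python as follows. This is only an illustration under stated assumptions, not the "fsel" code: the function names are hypothetical, NumPy is assumed, and only the Mahalanobis distances R(k) are computed. The error probability P(k) is not evaluated because the coefficients a_j are not given in this description.

```python
import numpy as np

def mahalanobis(m1, m2, S, idx):
    """R for the feature subset idx: (m2 - m1)^T S^{-1} (m2 - m1)."""
    d = (m2 - m1)[idx]
    return float(d @ np.linalg.solve(S[np.ix_(idx, idx)], d))

def stepwise_select(m1, m2, S):
    """Greedy forward selection: at each step add the feature maximizing R(k).

    Returns the selected feature indices j(1), ..., j(p) (0-based) and the
    corresponding distances R(1), ..., R(p).
    """
    m1, m2, S = np.asarray(m1, float), np.asarray(m2, float), np.asarray(S, float)
    selected, distances = [], []
    remaining = list(range(len(m1)))
    while remaining:
        # try each remaining feature together with the already selected ones
        best = max(remaining,
                   key=lambda j: mahalanobis(m1, m2, S, selected + [j]))
        selected.append(best)
        remaining.remove(best)
        distances.append(mahalanobis(m1, m2, S, selected))
    return selected, distances
```

Given a routine for P(k), the optimal number of features k_0 would be the position of the minimum of P over the returned sequence.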

Program input parameters 

All input parameters of the program are contained in the file “fsel.inp”.
An example of this input file is given below.

****INPUT FILE FOR PROGRAM "FSEL": standard****

NAME OF FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF LEARNING DATA
data/troitsky/resku.dat

NAME OF FILE FOR GRAPHIC RESULTS:
plot/troitsky/mygr.gr

Explanation of input file parameters 

NAME OF FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF 
LEARNING DATA 

This is the name of the file containing the covariance matrix and mean values of the
features, calculated by the program "ldstst".

NAME OF FILE FOR GRAPHIC RESULTS (output file):

This is the name of the control file for the UNIX routine "plotxy", containing the
commands for displaying the Mahalanobis distance curve R(k) and the error probability
curve P(k), along with the numbers and labels of the selected features.

Program output files with standard names 

1) File "perr.dat": contains the values of the classification error probabilities
corresponding to the selected features, in the order of their selection by the program.

2) File "jnum.dat": contains the numbers of the selected features in the order
of their selection by the program.

3) File "sumr.dat": contains the values of the Mahalanobis distances
corresponding to the selected features, in the order of their selection by the program.

1.6.4. Program "RECLLD"

Reclassification of learning vectors by the linear discriminator

The program "reclld" accomplishes the following operations:

1. Classification of given unattributed feature vectors X_1,...,X_r as belonging to one
of two classes. The classification is performed by the linear discrimination function
(LDF), which is “learned” using the estimates of the class mean vectors and covariance
matrix provided by the “ldstst” program on the basis of the learning data. There are
two modes: classification of "new" vectors X_i, and the reclassification mode, in which
the learning vectors from classes 1 and 2 are regarded as unattributed and are
classified anew by the LDF (previously learned with their help).

2. Plotting of the ranked LDF values corresponding to the vectors attributed by the
program to classes 1 and 2. The plot is produced with the help of the standard UNIX
routine "plotxy".

Classification in the program "reclld" is performed using the LDF. The input
data for the program are the statistics of the learning feature vectors, namely the
sample mean vectors m(k) for every class k = 1,2 and the sample covariance matrix S
(the same for both classes), and the feature vectors X_1,...,X_r to be classified.
For every class k = 1,2 the program calculates the "informants":

$$ I(k) = X^{T} S^{-1} m(k) - \tfrac{1}{2}\, m(k)^{T} S^{-1} m(k), \quad k = 1,2. $$

Then the program finds the number k_0 that maximizes the values I(k),
k = 1,2, and thus classifies the vector X as belonging to the class with number k_0.
This decision procedure is equivalent to comparing the value of the Linear
Discrimination Function

$$ L = I(2) - I(1) $$

with a threshold equal to 0.
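
A minimal Python sketch of this decision rule is given below. It assumes the standard form of the informant, I(k) = X^T S^{-1} m(k) - (1/2) m(k)^T S^{-1} m(k); the function names are hypothetical, and the handling of the tie L = 0 (assigned here to class 1) is an assumption, since the description does not specify it.

```python
import numpy as np

def informant(x, m, S_inv):
    """I(k) = x^T S^{-1} m(k) - 0.5 * m(k)^T S^{-1} m(k)."""
    return float(x @ S_inv @ m - 0.5 * m @ S_inv @ m)

def classify_ldf(x, m1, m2, S):
    """Return (class number, L) where L = I(2) - I(1) is compared with 0."""
    x, m1, m2 = (np.asarray(v, dtype=float) for v in (x, m1, m2))
    S_inv = np.linalg.inv(np.asarray(S, dtype=float))
    L = informant(x, m2, S_inv) - informant(x, m1, S_inv)
    return (2 if L > 0 else 1), L
```

For example, with m(1) = (0, 0), m(2) = (2, 0) and S equal to the identity matrix, the vector (1.5, 0) gives L = 1 and is assigned to class 2.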


Program input parameters 

All input parameters of the program are contained in the file
“reclld.inp”. An example of this input file is given below.

*****INPUT FILE FOR PROGRAM "RECLLD": standard*****

NAME OF INPUT FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF LEARNING DATA
data/troitsky/resku.dat

NAME OF FILE FOR VECTORS FOR CLASSIFICATION
data/troitsky/newvect.dat

CLASSIFICATION MODE: RECLASSIFICATION OF LEARN. VECTORS = 0; CLASSIFICATION OF NEW VECTORS = 1
0

FEATURES TO BE USED: UNSELECTED = 0; SELECTED = 1
0

NAME OF OUTPUT FILE FOR CLASSIFICATION RESULTS
data/troitsky/ldres.dat

NAME OF GRAPHIC CONTROL FILE FOR PLOTTING OF CLASSIFICATION RESULTS
plot/troitsky/reclplt.cnt

Explanations of program input file parameters

NAME OF INPUT FILE FOR MEAN VECTORS AND COVARIANCE MATRIX
OF LEARNING DATA

This is the name of the file from which the Linear Discrimination Function
parameters are read. The file is produced by the program "ldstst".

NAME OF INPUT FILE FOR VECTORS FOR CLASSIFICATION

This is the name of the file containing the unattributed vectors which have to be
classified by the program. The file is produced by the program "ldstst".

CLASSIFICATION MODE: RECLASSIFICATION OF LEARN. VECTORS = 0;
CLASSIFICATION OF NEW VECTORS = 1

The parameter switches the program to one of two modes, depending on the
origin of the vectors to be classified.

FEATURES TO BE USED: UNSELECTED = 0; SELECTED = 1























The parameter defines the feature set to be used for the classification of
unattributed vectors. If 0 is assigned, the program uses the initial feature set for
classification; if 1 is assigned, only the features selected by the program "fsel"
are used in the classification procedure.

NAME OF OUTPUT FILE FOR CLASSIFICATION RESULTS

This is the name of the file containing the table with the numbers of the unattributed
vectors and the numbers of the classes to which these vectors are assigned as a result
of classification.

NAME OF GRAPHIC CONTROL FILE FOR PLOTTING OF CLASSIFICATION
RESULTS

This is the name of the control file for the UNIX routine "plotxy" containing the
commands for displaying the program classification results.

1.6.5. Program "EXAMLD"

Estimation of error probability by cross-validation method
using linear discriminator

The program "examld" uses the statistically consistent cross-validation
method (synonyms: jackknife, "plug-in" method) to calculate estimates of the
classification error probabilities that the linear discrimination function (LDF)
provides for a given set of learning observations. The LDF values obtained for the
learning vectors of each class are ranked by magnitude. The points corresponding to
these two ranked sets are displayed on the screen using the standard UNIX routine
"plotxy". This gives a descriptive graphical representation of the classification
capability of the linear discriminator with regard to the given set of learning
observations.

The equations for calculating the linear discrimination function are given in the
description of the program “reclld”. The estimates v(1|2), v(2|1) of the classification
error probabilities of the first and second types, p(1|2), p(2|1), are calculated in the
program by the equations

$$ v(1|2) = n(1|2)/n_2, \qquad v(2|1) = n(2|1)/n_1, $$

where n(j|k) is the number of vectors from class k attributed by the LDF to
class j as a result of the cross-validation procedure, j,k = 1,2, and n_k is the total
number of vectors in class k = 1,2. The estimate of the averaged classification error
probability p_av is defined as

$$ v_{av} = \tfrac{1}{2}\bigl(v(1|2) + v(2|1)\bigr), $$

and the estimate of the total classification error probability p_tot is defined by the
formula

$$ v_{tot} = (n_2/n_{tot})\,v(1|2) + (n_1/n_{tot})\,v(2|1), \quad \text{where } n_{tot} = n_1 + n_2. $$

The well-known cross-validation procedure consists of n_tot = n_1+n_2 steps. At
every step, one of the learning vectors belonging to class j (j = 1,2) is eliminated
from the learning set. The remaining vectors are used as the learning data for
calculating the LDF parameters. The eliminated vector is then classified by
this LDF. If the vector is classified incorrectly, the value n(k|j), k = 3-j, is
increased by one. The eliminated vector is then returned to the learning set and the
next vector is extracted. This procedure is repeated for all learning vectors.
The values v(1|2), v(2|1), v_av and v_tot evaluated by the cross-validation method are
asymptotically unbiased estimates of the classification error probabilities p(1|2),
p(2|1), p_av and p_tot when n_1, n_2 tend to infinity at the same rate. The program
displays the values v(1|2), v(2|1), v_av and v_tot (in percent) on the screen.
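
The leave-one-out procedure can be sketched in Python as follows. This is an illustration under assumptions rather than the "examld" code: the function names are hypothetical, NumPy is assumed, the covariance matrix is taken as the pooled scatter divided by the number of remaining learning vectors, and a vector with L = 0 is assigned to class 1.

```python
import numpy as np

def ldf_value(x, m1, m2, S):
    """L = I(2) - I(1) = (m2 - m1)^T S^{-1} (x - (m1 + m2)/2)."""
    w = np.linalg.solve(S, m2 - m1)
    return float(w @ (x - 0.5 * (m1 + m2)))

def loo_error_estimates(X1, X2):
    """Return (v(1|2), v(2|1), v_av, v_tot) as fractions, not percent."""
    classes = {1: np.asarray(X1, dtype=float), 2: np.asarray(X2, dtype=float)}
    n = {k: len(v) for k, v in classes.items()}
    wrong = {1: 0, 2: 0}                 # wrong[j]: class-j vectors misclassified
    for j in (1, 2):
        other = classes[3 - j]
        for i in range(n[j]):
            rest = np.delete(classes[j], i, axis=0)   # eliminate one vector
            mj, mo = rest.mean(axis=0), other.mean(axis=0)
            Dj, Do = rest - mj, other - mo
            S = (Dj.T @ Dj + Do.T @ Do) / (len(rest) + len(other))
            m1, m2 = (mj, mo) if j == 1 else (mo, mj)
            L = ldf_value(classes[j][i], m1, m2, S)
            if (2 if L > 0 else 1) != j:
                wrong[j] += 1
    v21 = wrong[1] / n[1]                # class 1 vectors attributed to class 2
    v12 = wrong[2] / n[2]                # class 2 vectors attributed to class 1
    n_tot = n[1] + n[2]
    v_av = 0.5 * (v12 + v21)
    v_tot = (n[2] / n_tot) * v12 + (n[1] / n_tot) * v21
    return v12, v21, v_av, v_tot
```

With equal class sizes the averaged and total estimates coincide; they differ only when n_1 and n_2 are unequal.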

The program also calculates the sequences L_1(i), i = 1,...,n_1, and L_2(j), j = 1,...,n_2,
of LDF values obtained during the cross-validation procedure for the observations from
classes 1 and 2, respectively. These sequences are ranked by magnitude and plotted on
the screen using the standard UNIX routine "plotxy". The values L_1(i) and L_2(j) are
marked by different symbols (L_2(j) by the symbol "o"). The plots of L_1(i) and L_2(j)
give the user the following possibilities: to detect incorrectly classified
observations; to select the observations falling in the "uncertain" classification
area defined by the condition -c < L(i) < c, where c equals some fraction of
A = (max L_2(j) - min L_1(i)); and, finally, to estimate the distributions of L_1, L_2
in order to correct the classification threshold in accordance with these
distributions.


Program input parameters 


All input parameters of the program are contained in the file
“examld.inp”. An example of this input file is given below.

****INPUT FILE FOR PROGRAM EXAMLD: standard****

NAME OF FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF LEARNING DATA
data/troitsky/resku.dat

FEATURES TO BE USED: UNSELECTED = 0; SELECTED = 1
0

NAME OF FILE FOR GRAPHIC RESULTS
plot/troitsky/examld.gr
Explanation of the program input parameters

NAME OF FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF
LEARNING DATA

This is the name of the file from which the Linear Discrimination Function
parameters are read. The file is produced by the program "ldstst".

FEATURES BEING USED: UNSELECTED = 0, SELECTED = 1

The parameter defines the feature set to be used for the cross-validation
procedure. If 0 is assigned, the program uses the initial feature set in this procedure;
if 1 is assigned, only the features selected by the program "fsel" are used.

NAME OF GRAPHIC CONTROL FILE FOR PLOTTING OF CLASSIFICATION
RESULTS

This is the name of the control file for the UNIX routine "plotxy" containing the
commands for displaying the program classification results.


1.6.6. Program "EXAMQD"

Estimation of error probability by cross-validation method
using quadratic discriminator

The program "examqd" uses the statistically consistent cross-validation
method (synonyms: jackknife, "plug-in" method) to calculate estimates of the
classification error probabilities that the quadratic discrimination function (QDF)
provides for a given set of learning observations. The QDF values obtained for the
learning vectors of each class are ranked by magnitude. The points corresponding to
these two ranked sets are displayed on the screen using the standard UNIX routine
"plotxy". This gives a descriptive graphical representation of the classification
capability of the quadratic discriminator with regard to the given set of learning
observations.

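The quadratic discrimination function of the feature vector X to be classified is
defined by the equation:

$$ Q(X) = \sum_{j=1}^{p} \frac{\bigl(x_j - m_j(1)\bigr)^2}{\sigma_j^2(1)} - \sum_{j=1}^{p} \frac{\bigl(x_j - m_j(2)\bigr)^2}{\sigma_j^2(2)} + \sum_{j=1}^{p} \ln\frac{\sigma_j^2(1)}{\sigma_j^2(2)}, $$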
where x_j is the j-th component of the vector X being classified; m_j(k) is the j-th
component of the sample mean vector m(k) of the learning vectors from class k = 1,2;
and $\sigma_j^2(k)$ is the sample variance of the j-th component of the learning
vectors from class k = 1,2.

If Q(X) > 0, the quadratic classification algorithm assigns the vector X to class 2,
otherwise to class 1.

The equation above is derived under the assumption that the features are statistically
independent. This assumption leads to a simplified version of the well-known Quadratic
Discrimination Function, but it avoids the difficulties of inverting the sample
covariance matrices of classes 1 and 2, which are often nearly singular.
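
The simplified QDF can be written as a short Python sketch. It assumes the diagonal form Q(X) = sum_j (x_j - m_j(1))^2 / var_j(1) - sum_j (x_j - m_j(2))^2 / var_j(2) + sum_j ln(var_j(1) / var_j(2)); the function names are hypothetical and NumPy is assumed.

```python
import numpy as np

def qdf(x, m1, var1, m2, var2):
    """Simplified QDF for statistically independent features.

    m1, m2 are per-feature class means and var1, var2 per-feature class
    variances; no matrix inversion is needed.
    """
    x, m1, var1, m2, var2 = (np.asarray(v, dtype=float)
                             for v in (x, m1, var1, m2, var2))
    return float(np.sum((x - m1) ** 2 / var1)
                 - np.sum((x - m2) ** 2 / var2)
                 + np.sum(np.log(var1 / var2)))

def classify_qdf(x, m1, var1, m2, var2):
    """Class 2 if Q(X) > 0, otherwise class 1."""
    return 2 if qdf(x, m1, var1, m2, var2) > 0 else 1
```

Unlike the LDF, this rule can separate classes that have the same means but different variances: a vector far from the common mean is assigned to the class with the larger variance.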

The well-known cross-validation procedure for the QDF is exactly
the same as for the LDF (explained in the description of the program “examld”). The
program "examqd" generates the same output data as the program “examld”.

Comparison of the error probability estimates provided by the programs
“examld” and “examqd” allows one to choose the more appropriate statistical
classification algorithm, LDF or QDF: the one that gives the smaller misclassification
probability for the statistical characteristics of the features revealed by analysis
of the learning vectors.


Program input parameters 

All input parameters of the program are contained in the file
“examqd.inp”. An example of this input file is given below.

****INPUT FILE FOR PROGRAM EXAMQD: standard****

NAME OF FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF LEARNING DATA
data/troitsky/resku.dat

FEATURES TO BE USED: UNSELECTED = 0; SELECTED = 1
0

NAME OF FILE FOR GRAPHIC RESULTS
plot/troitsky/examqd.gr

Explanation of the program input parameters

NAME OF FILE FOR MEAN VECTORS AND COVARIANCE MATRIX OF
LEARNING DATA

This is the name of the file from which the Quadratic Discrimination Function
parameters are read. The file is produced by the program "ldstst".

FEATURES BEING USED: UNSELECTED = 0, SELECTED = 1

The parameter defines the feature set to be used for the cross-validation
procedure. If 0 is assigned, the program uses the initial feature set in this procedure;
if 1 is assigned, only the features selected by the program "fsel" are used in the
procedure.

NAME OF GRAPHIC CONTROL FILE FOR PLOTTING OF CLASSIFICATION
RESULTS

This is the name of the control file for the UNIX routine "plotxy" containing the
commands for displaying the program classification results.
1.8. Appendix


Appendix 1 

# SCRIPT FOR SELECTION OF EXPLOSIONS FOR CLASSIFICATION
# script troitsky/isrdtpr/selexpll.scr

. char path[] ="/detseis/seis/alex/data/israel/explos/"
. int i
. char t[26][10], k[26][6], f[26][10], r[26][10], d[26][10]

#      identifier         initial keep  flush poor     final keep     dist: near far
. && t[1]="8906201018";  k[1]="42";  f[1]="35 40";  r[1]="12 23";  d[1]="49 134"
. && t[2]="8911081109";  k[2]="42";  f[2]="35 40";  r[2]="17 22";  d[2]="87 144"
. && t[3]="9003271037";  k[3]="42";  f[3]="35 40";  r[3]="14 25";  d[3]="77 160"
. && t[4]="9004041301";  k[4]="42";  f[4]="35 40";  r[4]="16 22";  d[4]="61 137"
. && t[5]="9006130742";  k[5]="42";  f[5]="35 40";  r[5]="7 22";   d[5]="28 140"
. && t[6]="9007231222";  k[6]="42";  f[6]="35 40";  r[6]="12 19";  d[6]="39 92"
. && t[7]="9007260957";  k[7]="42";  f[7]="28 29";  r[7]="7 23";   d[7]="24 192"
. && t[8]="9008120753";  k[8]="42";  f[8]="28 29";  r[8]="7 13";   d[8]="46 145"
. && t[9]="9008281424";  k[9]="36";  f[9]="28 29";  r[9]="10 15";  d[9]="37 91"
. && t[10]="9010161405"; k[10]="42"; f[10]="36 39"; r[10]="11 23"; d[10]="52 137"
. && t[11]="9011061243"; k[11]="42"; f[11]="36 39"; r[11]="11 23"; d[11]="37 133"
. && t[12]="9012181223"; k[12]="42"; f[12]="36 39"; r[12]="21 23"; d[12]="97 137"
. && t[13]="9101091411"; k[13]="42"; f[13]="36 39"; r[13]="18 24"; d[13]="72 136"
. && t[14]="9102111255"; k[14]="42"; f[14]="36 39"; r[14]="12 35"; d[14]="56 231"
. && t[15]="9102121205"; k[15]="42"; f[15]="36 46"; r[15]="12 23"; d[15]="59 135"
. && t[16]="9103121106"; k[16]="46"; f[16]="36 46"; r[16]="17 27"; d[16]="78 155"
. && t[17]="9103170920"; k[17]="46"; f[17]="36 46"; r[17]="17 27"; d[17]="69 185"
. && t[18]="9103191402"; k[18]="48"; f[18]="36 46"; r[18]="21 28"; d[18]="71 136"
. && t[19]="9103211208"; k[19]="50"; f[19]="36 46"; r[19]="17 28"; d[19]="60 134"
. && t[20]="9104140830"; k[20]="50"; f[20]="36 46"; r[20]="22 28"; d[20]="68 142"
. && t[21]="9105061241"; k[21]="50"; f[21]="36 46"; r[21]="10 28"; d[21]="34 150"
. && t[22]="9105081304"; k[22]="50"; f[22]="36 46"; r[22]="22 29"; d[22]="71 134"
. && t[23]="9106170643"; k[23]="44"; f[23]="36 44"; r[23]="16 17"; d[23]="52 72"
. && t[24]="9108121303"; k[24]="50"; f[24]="36 44"; r[24]="22 31"; d[24]="70 164"
. && t[25]="9108181502"; k[25]="50"; f[25]="36 44"; r[25]="12 33"; d[25]="37 160"

for (i=1; i<26; i=i+1)
clearstack
&& unix uncompress &path.&t[i]..a.Z ; readdem &path.&t[i]..a
filterB all -L2.0 -H15.0 -THP -P2
&& keep &k[i]; plot all -y; flush (&f[i]); episortl; plot all -y
savepack &path.&t[i]..pk
&& flist all plot/troitsky/maskl; map plot/troitsky/israel.par
&& keep (&r[i]); plot all -y ; savepack &path.&t[i].sel.pk;
&& unix compress &path.&t[i]..a ; pause
endfor
Appendix 2

# SCRIPT FOR MEASURING OF EARTHQUAKE P AND S POWER FEATURES
# script troi/isrdtpr/earthmeas.scr

. char nr[28], far[28], text[80], evnm[28], evtp[80]
. char dist[10], path[80], qualn[10], qualf[10], text1[80]
. int count, cyc, mod, nc, nc1

# ASSIGNING TIME INTERVALS OF P AND S PHASES: mod = 0 ;
# PARAMETER MEASURING: mod = 1
. mod = 0; evtp ="EARTHQUAKE"; path ="data/israel/earthq/"

char identif[29][12]
&& identif[1]  = "8802241537"; identif[15] = "8802241537"
&& identif[2]  = "8802241537"; identif[16] = "8802241537"
&& identif[3]  = "8802241537"; identif[17] = "8802241537"
&& identif[4]  = "8802241537"; identif[18] = "8802241537"
&& identif[5]  = "8802241537"; identif[19] = "8802241537"
&& identif[6]  = "8802241537"; identif[20] = "8802241537"
&& identif[7]  = "8802241537"; identif[21] = "8802241537"
&& identif[8]  = "8802241537"; identif[22] = "8802241537"
&& identif[9]  = "8802241537"; identif[23] = "8802241537"
&& identif[10] = "8802241537"; identif[24] = "8802241537"
&& identif[11] = "8802241537"; identif[25] = "8802241537"
&& identif[12] = "8802241537"; identif[26] = "8802241537"
&& identif[13] = "8802241537"; identif[27] = "8802241537"
&& identif[14] = "8802241537"; identif[28] = "8802241537"

. for (cyc=1; cyc<29; cyc=cyc+1)
evnm = identif[cyc]
sndf11=0; sndf12=0; sndf21=0; sndf22=0; sndf31=0; sndf32=0
. sndf41=0; sndf42=0; sndf51=0; sndf52=0; sndf61=0; sndf62=0
perform switcher     # SET NEXT EARTHQUAKE PARAMETERS
clearstack
dist = "distance="
perform readwave     # READING WAVETRAIN
for (count=0; count<2; count=count+1)
perform multfilt     # WAVETRAIN MULTIBAND FILTERING
. when (mod=1)
perform calcspec     # CALCULATING SPECTRA
perform measuring    # MEASURING PHASE BAND POWER FEATURES
perform savepar      # SAVING SPECTRAL FEATURES FOR SIGN+NOISE
echo GO TO FAR EVENT
. endwhen
flush 5
. endfor
echo END OF EVENT FEATURE MEASURING
echo GO TO ANOTHER EVENT
. endfor
. return


########################## EVENT SWITCHER ########################

# LIST OF EVENT INFORMATION
# NEAR EVENTS:
# sndf11 = Start point of noise (sec)
# sndf12 = Length of noise (sec)
# sndf21 = Start point of P-wave (sec)
# sndf22 = Length of P-wave (sec)
# sndf31 = Start point of S-wave (sec)
# sndf32 = Length of S-wave (sec)
# FAR EVENTS:
# sndf41 = Start point of noise (sec)
# sndf42 = Length of noise (sec)
# sndf51 = Start point of P-wave (sec)
# sndf52 = Length of P-wave (sec)
# sndf61 = Start point of S-wave (sec)
# sndf62 = Length of S-wave (sec)

. switcher block 

when (cyc=l) # St JVI, St DOR 

&& qualn = "h"; nr = ”82"; sndfll =0.1; sndfl2 =34 

&& sndf21 =35.5; sndf22 =5; sndf31 =47; sndf32 =10 

&& qualf = "h"; far = "140"; sndf41 =0.1; sndf42 =43 

&& sndfSl =44.5; sndf52 =5; sndf61 =62.5; sndf62 =9 

else when (cyc=2) # St JVI, St DOR 

&& qualn = "h"; nr = "82”; sndfll=21.862; sndfl2 =6.4! 

&& sndf21 =31.2; sndf22 =5; sndf31 =42.8; sndf32 =8.4 

&& qualf = "m"; far = "144"; sndf41 =1.1; sndf42 =34. S 

&& sndf51 =39.5; sndf52 =6.2; sndf61 =59.2; sndf62 =4.7 

elsewhen (cyc=3) # St ADI, St PRNI 

&& qualn ="h"; nr = "37.5";sndfll =1.; sndfl2 =25.£ 

&& sndf21 =29. ; sndf22 =3.9; sndf31 =34.; sndf32 =2.6 

&& qualf ="m" ; far = "59.5";sndf41 =19.36; sndf42 =11.6 


&& sndf51 =33.4; 

sndf52 =4.9; 

sndf61 =41.; 

sndf62 =4.3 

elsewhen (cyc=4) 

# St DSI, 

St ZNT 


&& qualn ="m" ; 

nr = "??"; 

sndfll =1.9 

; sndf12 =35.8 

&& sndf21 =42.8; 

sndf22 =4.7; 

sndf31 =58.6 

; sndf32 =12.2 

&& qualf = "m"; 

far = "??"; 

sndf41 = 3 . ; 

sndf42 =53. 

&& sndf51 =61.4; 

sndf52 =9.2; 

sndf61 =94.; 

sndf62 =13.6 

elsewhen (cyc=5) 

# St ZNT, 

St DSI 


&& qualn ="h" ; 

nr = "90"; 

sndfll =0.9; 

sndf12 =41.6 

&& sndf21 =44.9; 

sndf22 =4.5; 

sndf31 =56.8 

; sndf32 =8.4 

&& qualf ="h"; 

far = "157"; 

sndf41 =0.9 

; sndf42 =51.6 

&&sndf51 =55.5; 

sndf52 =8; 

sndf61 =76; 

sndf62 =9 

elsewhen (cyc=6) 

# St JVI, 

St DSI 


fi&qualn = "1"; 

nr = "83"; 

sndfll = 19.: 

3; sndf12 =10.9 

&& sndf21 =50.; 

sndf22 =4.7; 

sndf31 =61.1 

; sndf32 =9.5 

&& qualf = "1"; 

far = "123"; 

sndf41 =0.7 

; sndf42 =52.9 

&& sndf51=55.8; 

sndf52 =4.6; 

sndf61 =71.9, 

; sndf62 =9.7 

elsewhen (cyc=7) 

# St MML, 

St DSI 


&& qualn ="h"; 

nr = "63"; 

sndfll = 2. ; 

sndf12 =35.5 

&& sndf21 =39.6; 

sndf22 =2.8; 

sndf31 =48.1, 

sndf32 =4.6 

&& qualf ="m"; 

far = "152"; 

sndf41 =1.2 

sndf42 =45.7 

&& sndf51 =53.6; 

sndf52 =5.7; 

sndf61 =72.4, 

sndf62 =5.9 

elsewhen (cyc=8) 

# St MML, 

St DSI 


&& qualn ="m" ; 

nr = "71"; 

sndfll =1.9; 

sndf12 =25.9 

&& sndf21 =34.6; 

sndf22 =5.3; 

sndf31 =44.; 

sndf32 =13.8 

&& qualf ="h"; 

far = "159"; 

sndf41 =1.5; 

sndf42 =43.8 

&& sndf51 =47.9; 

sndf52 =6. ; 

sndf61 =68.2; 

sndf62 =6.8 

elsewhen (cyc=9) 

# St KSHT, 

St MSDA 


&& qualn ="1" ; 

nr = "56"; 

sndfll =2.1 ; 

sndf12 =30.4 

&& sndf21 =35.9; 

sndf22 =5. ; 

sndf31 =45.9; 

sndf32 =5.5 

&& qualf ="1" ; 

far = "164"; 

sndf41 =3. ; 

sndf42 =48.8 

&& sndfSl =54.7; 

sndf52 =7.2; 

sndf61 =75.4; 

sndf62 =10.1 

elsewhen (cyc=10) 

# St MML, 

St ZNT 


&& qualn ="h" ; 

nr = "22"; 

sndfll =2.1; 

sndf12 =22.9 

&& sndf21 =28.1; 

sndf22 =3.0; 

sndf31 =32.0; 

sndf32 =6.3 

&& qualf ="m" ; 

far = "39"; 

sndf41 =1.8; 

sndf42 =27.4 

&& sndf51 =30.6; 

sndf52 =4.1; 

sndf61 =37.5; 

sndf62 =4.3 

elsewhen (cyc=ll) 

# St MML, 

St ZNT 


























&& qualn ="h"; 
&& sndf21 =33.; 
&& qualf ="h n ; 
&& sndf51 =47.7; 
elsewhen (cyc=12) 


&& 

qualn ="m"; 

nr = 

"64"; 

&& 

sndf21 =43.2; 

sndf22 

=5.9; 

&& 

qualf ="m"; 

far = 

"135"; 

&& 

sndf51 =53.7; 

sndf52 

=6.4; 


nr = "48"; sndfll =2.2; 

sndf22 =4.4; sndf31 =39.1; 
far = "143"; sndf41 =2.0 ; 
sndf52 =8.1; sndf61 =65.2; 
# St HMDT, St DSI 
nr = "64"; sndfll =2.1; 

sndf22 =5.9; sndf31 =52.1; 
far = "135"; sndf41 =1.8 ; 
sndf52 =6.4; sndf61 =70.9; 


elsewhen (cyc=13) 
&& qualn ="h" ; 
&& 8ndf21 =35.9; 
&& qua If ="m"; 

&& sndf51 =48.1; 
elsewhen (cyc=14) 
&& qualn ="1"; 

&& sndf21 =40.7; 
&& qualf ="1" ; 

&& sndf51 = 48.8; 
elsewhen (cyc=15) 
&& qualn ="m"; 

&& sndf21 =41.8; 
&& qualf ="m" ; 

&& sndf51 =49.6; 
elsewhen (cyc=16) 
&& qualn ="h"; 

&& sndf21 =48.1; 
&& qualf ="h" ; 

&& sndf51 =60.5; 
elsewhen (cyc=17) 
&& qualn ="h"; 

&& sndf21 =40.4; 
&& qualf ="h"; 

&& sndf51 =50.6; 
elsewhen (cyc=18) 
&& qualn ="h" ; 

&& sndf21 =39.7; 
&& qualf ="h" ; 

&& sndf51 =49.1; 
elsewhen (cyc=19) 
&& qualn ="m" ; 
&& sndf21 =39.8; 
&& qualf ="m" ; 

&& sndfSl =45.8; 
elsewhen (cyc=20) 
&& qualn ="h"; 

&& sndf21 =43.3; 
&& qualf ="h" ; 

&& sndf51 =55.4; 
elsewhen (cyc=21) 
&& qualn = "m"; 
&& sndf21 =31.4; 
&& qualf = "m" ; 
&& sndf51 =36.5; 
elsewhen (cyc=22) 
&&qualn ="h" ; 
&&sndf21 =44.2; 

&& qualf ="m" ; 

&& sndf51 =56.2; 
elsewhen (cyc=23) 
&& qualn ="m" ; 
&& sndf21 =30.7; 


# St MML, 
nr = "41"; 

sndf22 =4.8; 


St SDOM 
sndfll =2.; 
sndf31 =41.5; 


far = "189"; 

sndf41 

=3.5 ; 

sndf52 =4.8; 

sndf61 

=56. ; 

# St ZNT, 

St BGI 


nr = "66"; 

sndfll 

= 2.5; 

sndf22 =7.2; 

sndf31 

=51.8; 

far = "120"; 

sndf41 

=3.5 ; 

8ndf52= 4.6; 

sndf61 

= 64.9 

# St ZNT, 

St BGI 


nr = "69"; 

sndfll 

=1.5; 

sndf22 =5.; 

sndf31 

=49.6; 

far = "122"; 

sndf41 

=1.7 ; 

sndf52 =6.1; 

sndf61 

=65.2; 

# St HMDT, 

St DSI 


nr = "66"; 

sndfll 

=2. ; 

sndf22 =6.6; 

sndf31 

=57.1; 


far = "143"; 
sndf52 =8.6; 


sndf41 
sndf61 


* 2 .; 
:79. ; 


# St 

JVI, 

St YTIR 

nr = 

"72"; 

sndfll = 8.9; 

s ndf2 2 

=5.7; 

sndf31 =49.6; 

far = 

"137"; 

sndf41 =2.2 ; 

sndf52 

=7.; 

sndf61 =67.1; 

# St 

GVMR, 

St JVI 

nr = 

"67"; 

sndfll =1.1 ; 

sndf22 

=5.1; 

sndf31 =48.9; 

far = 

"131"; 

sndf41 =1.6; 

sndf52 

=8.1; 

sndf61 =65.9; 

# St HMDT, St JVI
nr = "65"; sndf11 =1.7;
sndf22 =7.2; sndf31 =48.4;
far = "103"; sndf41 =1.7;
sndf52 =9.9; sndf61 =59.;

# St HMDT, St DSI
nr = "65.5"; sndf11 =1.6;
sndf22 =7.5; sndf31 =52.4;
far = "143"; sndf41 =2.3;
sndf52 =12.5; sndf61 =72.4;

# St KSHT, St HMDT
nr = "30"; sndf11 =2.5;
sndf22 =2.5; sndf31 =35.;
far = "63"; sndf41 =5.9;
sndf52 =5.9; sndf61 =44.5;

# St HMDT, St DSI
nr = "65"; sndf11 =2.5;
# sndf22: value illegible in the source scan
sndf31 =53.4;
far = "143"; sndf41 =2.5;
sndf52 =8.9; sndf61 =73.8;

# St MML, St BRNI
nr = "48"; sndf11 =1.2;
sndf22 =5.5; sndf31 =37.7;


sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 
sndf42 
sndf62 

sndf12 
sndf32 


























&& qualf = "l"; far = "56"; sndf41 =1.4; sndf42 =24.7
&& sndf51 =34.6; sndf52 =4.; sndf61 =41.3; sndf62 =13.2

elsewhen (cyc=24) # St HRI, St HMDT
&& qualn ="m"; nr = "48"; sndf11 =2.; sndf12 =28.4
&& sndf21 =33.3; sndf22 =5.8; sndf31 =40.6; sndf32 =13.7
&& qualf ="l"; far = "65"; sndf41 =2.2; sndf42 =28.2
&& sndf51 =35.7; sndf52 =6.4; sndf61 =44.6; sndf62 =12.2

elsewhen (cyc=25) # St HMDT, St DSI
&& qualn ="h"; nr = "64"; sndf11 =2.4; sndf12 =38.6
&& sndf21 =43.9; sndf22 =5.1; sndf31 =52.6; sndf32 =17.3
&& qualf ="h"; far = "141"; sndf41 =2.1; sndf42 =50.1
&& sndf51 =55.9; sndf52 =7.2; sndf61 =73.2; sndf62 =21.2


elsewhen (cyc=26) # St MML, St HMDT
&& qualn ="m"; nr = "46"; sndf11 =1.7; sndf12 =26.7
&& sndf21 =30.8; sndf22 =4.9; sndf31 =37.5; sndf32 =6.4
&& qualf ="l"; far = "63"; sndf41 =1.1; sndf42 =29.
&& sndf51 =33.5; sndf52 =6.2; sndf61 =42.6; sndf62 =8.3


elsewhen (cyc=27) # St MML, St DSI
&& qualn ="h"; nr = "82"; sndf11 =1.7; sndf12 =41.4
&& sndf21 =45.6; sndf22 =7.3; sndf31 =54.8; sndf32 =18.7
&& qualf ="h"; far = "172"; sndf41 =2.2; sndf42 =53.4
&& sndf51 =59.3; sndf52 =9.8; sndf61 =80.; sndf62 =18.5


elsewhen (cyc=28) # St HRI, St MSDA
&& qualn ="h"; nr = "83"; sndf11 =2.; sndf12 =30.8
&& sndf21 =35.2; sndf22 =4.7; sndf31 =45.8; sndf32 =6.4
&& qualf ="h"; far = "155"; sndf41 =2.4; sndf42 =40.7
&& sndf51 =47.5; sndf52 =7.5; sndf61 =66.8; sndf62 =12.9


.endblock of switcher 

########## READING WAVETRAINS AND FILTERING ################## 

. block readwave 
. text = "0 -3. 0.14" 

textl = "SPECTRA OF EARTHQUAKE PHASES AND PRECEDING NOISE"
unix /usr/bin/rm data/troitsky/spctl.bb
. savevar data/troitsky/spctl.bb text textl

echo READING WAVETRAIN 

text = "sel.pk" 

readpack &path.&evnm.&text

notes (1) 1 0.7 m &cyc &evtp &evnm

notes (1) 1 0.4 m &dist &nr

notes (2) 1 0.7 m &cyc &evtp &evnm

notes (2) 1 0.4 m &dist &far

plot all -y

#savepack &path.&evnm.&text
.endblock 

########## WAVETRAIN MULTIBAND FILTERING ################## 

. block multfilt 

. echo WAVETRAIN MULTIBAND FILTERING 
filterB (1) -L1. -H3. -R5 -TBP -P2

notes (1) 1 0.7 m 1-3 Hz 

&& move 1 2; filterB (1) -L3. -H6. -R5 -TBP -P2 

notes (1) 1 0.7 m 3-6 Hz 

&& move 1 3; filterB (1) -L6. -H10. -R5 -TBP -P2 

notes (1) 1 0.7 m 6-10 Hz 

&& move 1 4; filterB (1) -L10. -H15. -R5 -TBP -P2

notes (1) 1 0.7 m 10-15 Hz 
move 1 5 


. float vert 

























vertical 

. when (count =0) 

vertical b o &sndf11;
vert = sndf11 + sndf12

&& vertical b o &vert; vertical u o &sndf21
vert = sndf21 + sndf22

&& vertical u o &vert; && vertical r o &sndf31
vert = sndf31 + sndf32
vertical r o &vert
. else

vertical b o &sndf41
vert = sndf41 + sndf42

&& vertical b o &vert; vertical u o &sndf51
. vert = sndf51 + sndf52

&& vertical u o &vert; vertical r o &sndf61
. vert = sndf61 + sndf62
vertical r o &vert
. endwhen 
plot 5 
.endblock 

################# CALCULATING SPECTRA ###################### 
. block calcspec 
. echo CALCULATING SPECTRA 
. char epidis[25] 

. when (count = 0) 

. && epidis = nr; text = "nr"
winon &sndf11 &sndf12
. else
epidis = far; text = "far"
winon &sndf41 &sndf42
. endwhen
&& powspec (1) 50 5; rename (1) NOISE__&text
&& notes (1) 1 0.7 m &cyc &evtp &evnm &dist &epidis
winoff

. when(count = 0)
winon &sndf21 &sndf22
. text = "nr"
. else
winon &sndf51 &sndf52
text = "far"
. endwhen
&& powspec (2) 50 5; rename (1) P-WAVE__&text
&& notes (1) 1 0.7 m &cyc &evtp &evnm &dist &epidis
winoff

. when (count = 0)
winon &sndf31 &sndf32
text = "nr"
. else
winon &sndf61 &sndf62
text = "far"
. endwhen
&& spec (3) 50 5; rename (1) S-WAVE_&text
&& notes (1) 1 0.7 m &cyc &evtp &evnm &dist &epidis; winoff

#-#-#-#-#-#-#-#-#-#- 

. echo PLOTTING AND SAVING EVENT SPECTRA 
#plotspec (1-3) -GO 

fplotspec (1-3) -GO -Cdata/troitsky/spctl.bb 

#plot all -y 

#pause 

#unix lpr -PI plot/snda.ps 


























&& chanon 3; savepack &path.&evnm.&text.sp.pk
&& chanoff; flush 3 
#pause 
. endblock 

############################################################# 

. block measuring 

. echo MEASURING PHASE BAND POWER FEATURES 

. echo SAVING EVENT NAME, STATION DISTANCES AND PHASE INTERVALS
. text = " "
. savevar data/troitsky/earthpf.bb cyc
. savevar data/troitsky/earthpf.bb evtp evnm
. text = "noise_start, noise_len, P_start, P_len, S_start, S_len"
. when (count = 0)
. savevar data/troitsky/earthpf.bb dist nr
savevar data/troitsky/earthpf.bb text ;
savesnd data/troitsky/earthpf.bb
sndf11,sndf12,sndf21,sndf22,sndf31,sndf32
. else
. savevar data/troitsky/earthpf.bb dist far
savevar data/troitsky/earthpf.bb text
savesnd data/troitsky/earthpf.bb
sndf41,sndf42,sndf51,sndf52,sndf61,sndf62
. endwhen 
sqr (1-5) total 
. echo NOISE MEASURING 
.float strt, len 
. when (count = 0) 

&& strt = sndf11; len = sndf12
. else 

&& strt = sndf41; len = sndf42 
. endwhen 
. float np[5] 

mean (1-5) &strt &len np
. text = "Av N band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb np

#-#-#-#-#-#-#-#- 

. echo P-WAVE MEASURING 
. when (count = 0) 

&& strt = sndf21;len = sndf22 
. else 

. && strt = sndf51; len = sndf52 

. endwhen 

# P+N MAX POWER IN FRQ BANDS 

. float tmx[5], pamx[5] 

tmax (1-5) &strt &len tmx pamx 

#. text = "Max(P+N) band squar amplitudes" 

#. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb pamx

winon &strt &len
. float tmxw[5]
. for (nc =0; nc <5; nc = nc+1)
tmxw[nc] = tmx[nc] - strt
. endfor

&& vertical; vertical r o &tmxw[0] ; vertical b o &tmxw[1]
&& vertical u o &tmxw[2] ; vertical r o &tmxw[3] ;
vertical u o &tmxw[4]
. text = "P-wave"
&& notes (1) 1 -0.4 m &text; plot 5 -y; winoff


# ‘Pure’ P MAX POWER IN FRQ BANDS 


























. float ppm[5], aux[1], pnpm[5]
. for (nc =0; nc <5; nc = nc+1)
. tmxw[nc] = tmx[nc] - 0.5
mean (&ncl) &tmxw[nc] 1 aux
. && ppm[nc] = aux[0]; pnpm[nc] = ppm[nc] - np[nc]
. endfor

. text = "Max(P+N) band powers" 

. savevar data/troitsky/earthpf.bb text 
. savevar data/troitsky/earthpf.bb ppm 
. text = "'Pure' MaxP band powers" 

. savevar data/troitsky/earthpf.bb text 
. savevar data/troitsky/earthpf.bb pnpm 

# P+N AVERAGED POWERS IN FRQ. BANDS 
. float pp[5] 

mean (1-5) &strt &len pp 
. text = "Av(P+N) band powers" 

. savevar data/troitsky/earthpf.bb text 
. savevar data/troitsky/earthpf.bb pp 

# 'PURE' P AVERAGED POWERS IN FRQ. BANDS
.float pnp[5]
. for (nc =0; nc <5; nc = nc+1)
pnp[nc] = pp[nc] - np[nc]
. endfor
. text = "'Pure' AvP band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb pnp

# P+N SPECTRAL SHAPE; 'PURE' P SPECTRAL SHAPE
. float prb[4] , prbn[4]
. for (nc =0; nc<4; nc = nc+1)
. prb [nc] = pp[nc+1] / pp[0]
. prbn[nc] = pnp[nc+1] / pnp[0]
.endfor
. text = "Av(P+N) ratios of band powers to total power"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb prb
. text = "'Pure' AvP ratios of band power to total power"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb prbn

# -#-#-#-#-#-#-#- 

. echo S-WAVE MEASURING 
. when (count = 0) 

strt = sndf31; len = sndf32 
. else 

&& strt = sndf61; len = sndf62 
. endwhen 

# S+N MAX POWER IN FRQ BANDS 
. float samx[5] 

tmax (1-5) &strt &len tmx samx
winon &strt &len
. for (nc=0; nc<5; nc = nc+1)
. tmxw[nc] = tmx[nc] - strt
.endfor

&& vertical; vertical r o &tmxw[0] ; vertical b o &tmxw[1]
&& vertical u o &tmxw[2] ; vertical r o &tmxw[3] ; vertical u o &tmxw[4]
. text = "S-wave" 
























&& notes (1) 1 -0.4 m &text; plot 5 -y 
winoff 

. for (nc=0; nc<5; nc = nc+1)
. tmxw[nc] = tmx[nc] - 0.5
.endfor

. float spm[5], snpm[5]
. for (nc=0; nc<5; nc = nc+1)
&& ncl=nc +1; mean (&ncl) &tmxw[nc] 1 aux
spm[nc] =aux[0] ; snpm[nc] =spm[nc] -np[nc]
. endfor
. text = "Max(S+N) band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb spm
. text = "'Pure' MaxS band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb snpm

# S+N AVERAGED POWERS IN FRQ BANDS 
. float sp[5] 

mean (1-5) &strt &len sp
. text = "Av(S+N) band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb sp

# "PURE" S AVERAGED POWERS IN FRQ. BANDS
. float snp[5]
. for (nc=0; nc<5; nc = nc+1)
. snp[nc] = sp[nc] - np[nc]
. endfor
. text = "'Pure' AvS band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb snp

# S+N SPECTRAL SHAPE; 'PURE' S SPECTRAL SHAPE
. float srb[4], srbn[4]
. for (nc=0; nc<4; nc = nc+1)
srb[nc] = sp[nc+1] / sp[0]
srbn[nc] = snp[nc+1] / snp[0]
. endfor
. text = "Av(S+N) ratios of band powers to total power"
. savevar data/troitsky/earthpf.bb text ;
. savevar data/troitsky/earthpf.bb srb
. text = "'Pure' AvS ratios of band powers to total power"
. savevar data/troitsky/earthpf.bb text ;
. savevar data/troitsky/earthpf.bb srbn

# -#-#-#-#-#-#-#-#-#- 

. echo CALCULATION OF S-P FREQ. BAND POWER RATIOS 

# RATIOS OF SQUARES OF MAX S+N AMPL TO MAX P+N AMPL
float mspar[5], mspr[5] , mnspr[5], spr[5], nspr[5]
for (nc=0; nc<5; nc = nc+1)
mspar[nc] = samx[nc] / pamx[nc]
mspr [nc] = spm[nc] / ppm[nc]
mnspr[nc] = snpm[nc] / pnpm[nc]
spr [nc] = sp[nc] / pp[nc]
nspr[nc] = snp[nc] / pnp[nc]
endfor
text = "Max (S+N)/Max (P+N) square amplitude ratios in fr. bands"
savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb mspar






















. text = "Max(S+N)/Max(P+N) ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb mspr
. text = "'Pure' MaxS/MaxP ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb mnspr
. text = "Av(S+N)/Av(P+N) ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb spr
. text = "'Pure' AvS/AvP ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb nspr
. endblock of measuring 

######### SAVING SPECTRAL FEATURES FOR SIGN+NOISE ########### 
.block savepar 

. echo SAVING SPECTRAL FEATURES FOR SIGN+NOISE 

. text = "Av(P+N) band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb pp
. text = "Max(P+N) band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb ppm
. text = "Av(S+N) band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb sp
. text = "Max(S+N) band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb spm
. text = "Av(P+N) ratios of band powers to total power"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb prb
. text = "Av(S+N) ratios of band powers to total power"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb srb
. text = "Av(S+N)/Av(P+N) ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb spr
. text = "Max(S+N)/Max(P+N) amplitude ratio"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb mspar
. text = "Max(S+N)/Max(P+N) ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb mspr

. echo SAVING SPECTRAL FEATURES FOR 'PURE' SIGNAL
. text = "'Pure' AvP band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb pnp
. text = "'Pure' MaxP band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb pnpm
. text = "'Pure' AvS band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb snp
. text = "'Pure' MaxS band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb snpm
. text = "'Pure' AvP ratios of band power to total power"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb prbn
. text = "'Pure' AvS ratios of band powers to total power"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb srbn
. text = "'Pure' AvS/AvP ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb nspr
. text = "'Pure' MaxS/MaxP ratios for band powers"
. savevar data/troitsky/earthpf.bb text
. savevar data/troitsky/earthpf.bb mnspr
.endblock 
end of script 
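
The arithmetic behind the "pure" features saved by the script above can be illustrated outside SNDA. The following Python sketch is not part of the original package: the function names and the numbers are hypothetical, and it assumes that index 0 of every array holds the total (broadband) power while indices 1-4 hold the four filtered bands, as the "ratios of band powers to total power" labels suggest.

```python
def pure_power(signal_plus_noise, noise):
    """Band-by-band 'pure' power: (signal + noise) power minus noise power.

    The result can be zero or negative when noise dominates; such bands
    would need special handling before any ratio is formed.
    """
    return [s - n for s, n in zip(signal_plus_noise, noise)]


def band_to_total(powers):
    """Ratios of the band powers (indices 1..4) to the total power (index 0)."""
    return [p / powers[0] for p in powers[1:]]


def s_to_p(s_powers, p_powers):
    """S/P power ratio in every band."""
    return [s / p for s, p in zip(s_powers, p_powers)]


# Illustrative numbers only; they are not taken from the report.
noise = [1.0, 0.5, 0.25, 0.125, 0.125]   # plays the role of np[] in the script
pp = [5.0, 2.5, 1.25, 0.625, 0.625]      # average P+N band powers
sp = [9.0, 4.5, 2.25, 1.125, 1.125]      # average S+N band powers

pnp = pure_power(pp, noise)    # 'pure' P powers, like pnp[]
snp = pure_power(sp, noise)    # 'pure' S powers, like snp[]
prbn = band_to_total(pnp)      # 'pure' P spectral shape, like prbn[]
nspr = s_to_p(snp, pnp)        # 'pure' S/P ratios, like nspr[]
```

With these inputs the "pure" S/P ratio equals 2 in every band, while the P spectral shape falls from 0.5 to 0.125 with increasing band index.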

***************************************************************************** 

Appendix 3 . 


Labels of features obtained with the help of 
"filtration" method for "pure" variant 

 1  prbp1    avp(P,D1)/avp(P,D0)       10  msprp1   maxp(S,D1)/maxp(P,D1)
 2  prbp2    avp(P,D2)/avp(P,D0)       11  msprp2   maxp(S,D2)/maxp(P,D2)
 3  prbp3    avp(P,D3)/avp(P,D0)       12  msprp3   maxp(S,D3)/maxp(P,D3)
 4  prbp4    avp(P,D4)/avp(P,D0)       13  msprp4   maxp(S,D4)/maxp(P,D4)
 5  srbp1    avp(S,D1)/avp(S,D0)       14  asprp0   avp(S,D0)/avp(P,D0)
 6  srbp2    avp(S,D2)/avp(S,D0)       15  asprp1   avp(S,D1)/avp(P,D1)
 7  srbp3    avp(S,D3)/avp(S,D0)       16  asprp2   avp(S,D2)/avp(P,D2)
 8  srbp4    avp(S,D4)/avp(S,D0)       17  asprp3   avp(S,D3)/avp(P,D3)
 9  msprp0   maxp(S,D0)/maxp(P,D0)     18  asprp4   avp(S,D4)/avp(P,D4)

The labels of features obtained with the help of "FFT" method for "pure" 
variant 

 1  psmfp    fmax(P)                    9  sbrp3    avp(S,D3)/avp(S,D0)
 2  pbrp1    avp(P,D1)/avp(P,D0)       10  sbrp4    avp(S,D4)/avp(S,D0)
 3  pbrp2    avp(P,D2)/avp(P,D0)       11  rmspp    Maxsd(S,D0)/Maxsd(P,D0)
 4  pbrp3    avp(P,D3)/avp(P,D0)       12  spbrp0   avp(S,D0)/avp(P,D0)
 5  pbrp4    avp(P,D4)/avp(P,D0)       13  spbrp1   avp(S,D1)/avp(P,D1)
 6  ssmfp    fmax(S)                   14  spbrp2   avp(S,D2)/avp(P,D2)
 7  sbrp1    avp(S,D1)/avp(S,D0)       15  spbrp3   avp(S,D3)/avp(P,D3)
 8  sbrp2    avp(S,D2)/avp(S,D0)       16  spbrp4   avp(S,D4)/avp(P,D4)

***************************************************************************** 

Appendix 4 

# SCRIPT FOR STATISTICAL CLASSIFICATION FEATURE SELECTION
# AND ERROR PROBABILITY ESTIMATION
# script troitsky/stcl/selfeatrs.scr
clearstack

# INPUT DATA TO THE STACK
# THE VARIANTS OF FEATURE SETS
#BY THE PROGRAM "LD"

. char[40] ldinp 

ldinp = "troitsky/ldpFb.inp" 

#ldinp = "troitsky/ldpFb+s.inp -c" 

#ldinp = "troitsky/ldtest.inp -c" 

#ldinp = "troitsky/ldnFb.inp -c" 

#ldinp = "troitsky/ldpNb.inp -c" 

#ldinp = "troitsky/ldnNb.inp -c" 

#ldinp = "troitsky/ldpFs.inp -c" 

#ldinp = "troitsky/ldpFs+b.inp -c"

#ldinp = "troitsky/ldnFs.inp -c"

#ldinp = "troitsky/ldpNs.inp -c"

#ldinp = "troitsky/ldnNs.inp -c"
ld &ldinp
list 

# DISPLAYING DATA TRACES &
#SCATTERING DIAGRAMS
vertical r b 27
vertical b b 52

plot all -y -x
cluster patr.par -c

################## FEATURE NONLINEAR TRANSFORM. 
#BOX-COX NORMALIZING TRANSFORM, 
log all 

plot all -y -x 
. float exp 
. exp=1./7.
power all &exp 
addc all -1. 
scale all 7. 
plot all -y -x 

##################### CALCULATION STATISTICS 

# OF LEARNING DATA FOR DIFFERENT
# DATA SETS BY THE PROGRAM "LDSTST"

. char[40] ldststinp

#ldststinp = "troitsky/ldststpFb.inp -c"
#ldststinp = "troitsky/ldststpFb+s.inp -c"
#ldststinp = "troitsky/ldststnNb.inp -c"
#ldststinp = "troitsky/ldststnFs.inp -c"
#ldststinp = "troitsky/ldststpNs.inp -c"
#ldststinp = "troitsky/ldststnNs.inp -c"
#ldststinp = "troitsky/ldststpNb.inp -c"
#ldststinp = "troitsky/ldststpFs.inp -c"
#ldststinp = "troitsky/ldststpFs+b.inp -c"
#ldststinp = "troitsky/ldststnFb.inp -c"
ldstst &ldststinp
XYplot troitsky/gr.gr 
unix lpr -P0 plot/snda.ps 

# SELECTION OF THE MOST INFORMATIVE 

# FEATURES BY THE PROGRAM "FSEL" 
fsel troitsky/fsell.inp -c 
XYplot troitsky/mygr.gr

unix lpr -P0 plot/snda.ps

sort data/troitsky/jnum.dat

plot all -y -x 

unix lpr -P0 plot/snda.ps 

pause 

cluster patr.par -c 

# RECLASSIFICATION OF LEARNING VECTORS 

# BY THE PROGRAM "RECLLD" 
reclld troitsky/reclld.inp -c

unix textedit data/troitsky/ldres.dat &
unix textedit data/troitsky/qdres.dat &

XYplot troitsky/reclplt.cnt 

# CROSS VALIDATION EXAMINATION 

# OF LEARNING VECTORS BY LINEAR & 

# QUADRATIC DISCRIMINATORS: 

#PROGRAMS "EXAMLD", "EXAMQD" 
examld troitsky/examld.inp -c
XYplot troitsky/examld.gr 
examqd troitsky/examqd.inp -c 
XYplot troitsky/examqd.gr 

# END OF SCRIPT 
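
The command sequence "power all &exp", "addc all -1.", "scale all 7." with exp = 1/7 in the listing above corresponds to the Box-Cox power transform y = (x^lambda - 1)/lambda with lambda = 1/7. A minimal Python sketch of that transform is given below for illustration; the function name is hypothetical and the sample values are not taken from the report.

```python
import math


def box_cox(x, lam):
    """Box-Cox normalizing transform of a positive feature value.

    Returns (x**lam - 1) / lam, or log(x) in the limiting case lam == 0.
    """
    if x <= 0.0:
        raise ValueError("Box-Cox requires a positive argument")
    if lam == 0.0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


LAMBDA = 1.0 / 7.0              # exponent used in the script listing
values = [1.0, 128.0]           # illustrative feature values
transformed = [box_cox(v, LAMBDA) for v in values]
```

For lambda = 1/7 the value 1 maps to 0 and 128 (the seventh power of 2) maps to approximately 7, which shows how strongly the transform compresses large feature values.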
1.9 Appendix II

Application of statistical source identification package 
for teleseismic event discrimination. 


Appendix 1 


The performance of the statistical source identification program package was also
tested by applying it to the discrimination of teleseismic earthquakes and
explosions. Seismograms of 28 underground nuclear explosions at the Semipalatinsk test
site and 33 earthquakes that occurred in Eastern Kazakhstan were analyzed. Following
paper [6], we took the normalized powers of the P-phase and P-coda waveforms
in eight spectral bands spanning the 0.2-5.0 Hz range as the features characterizing
the shapes of the P-phase and P-coda power spectra. The frequencies of the spectral
peaks of the P-phase and P-coda and the ratio of the maxima of the P-wave and P-coda
power spectra were also used as additional features. The method for the most informative feature
extraction implemented in this test was the same as described in Section. 

Only the eight most informative features were selected from the initial 19
features; this subset minimizes the estimated probability of misclassification.
The dominant features were the ratio of the maxima of the P-phase and P-coda power
spectra and the P-wave and P-coda normalized powers in the low-frequency bands.
A linear discriminant function was used for seismogram classification and
error probability estimation based on the selected features.
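
The combination of a linear discriminant function with a cross-validation estimate of the error probability can be sketched as follows. This Python example is only an illustration under stated assumptions, not the algorithm of the "examld" program: it uses two features, a pooled covariance matrix, equal prior probabilities and leave-one-out cross-validation, and all names and data points are hypothetical.

```python
def mean_vec(rows):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def fit_lda2(a, b):
    """Fit a two-class linear discriminant w.x + bias for 2-D features."""
    ma, mb = mean_vec(a), mean_vec(b)
    # Pooled within-class scatter, turned into a covariance estimate.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((a, ma), (b, mb)):
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    c = [[s[i][j] / dof for j in range(2)] for i in range(2)]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    inv = [[c[1][1] / det, -c[0][1] / det], [-c[1][0] / det, c[0][0] / det]]
    d = (ma[0] - mb[0], ma[1] - mb[1])
    w = (inv[0][0] * d[0] + inv[0][1] * d[1], inv[1][0] * d[0] + inv[1][1] * d[1])
    mid = ((ma[0] + mb[0]) / 2.0, (ma[1] + mb[1]) / 2.0)
    bias = -(w[0] * mid[0] + w[1] * mid[1])   # threshold for equal priors
    return w, bias


def classify(model, x):
    """Return 'a' or 'b' according to the sign of the discriminant."""
    w, bias = model
    return "a" if w[0] * x[0] + w[1] * x[1] + bias > 0.0 else "b"


def loo_error(a, b):
    """Leave-one-out estimate of the misclassification probability."""
    errors = 0
    for i in range(len(a)):
        errors += classify(fit_lda2(a[:i] + a[i + 1:], b), a[i]) != "a"
    for i in range(len(b)):
        errors += classify(fit_lda2(a, b[:i] + b[i + 1:]), b[i]) != "b"
    return errors / (len(a) + len(b))


# Two well separated toy clusters (illustrative values only).
class_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class_b = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
error_rate = loo_error(class_a, class_b)
```

Each observation is classified by a discriminant fitted without it, so the resulting error rate is not biased by reuse of the learning data.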

Below, X1,X2,...,X8 denote the logarithms of the normalized P-phase power
fractions in the eight frequency bands, X9-X16 the corresponding P-coda power
fractions, X17 and X18 the logarithms of the frequencies of the P-phase and P-coda
spectral peaks, and X19 the logarithm of the ratio of the spectral maxima of the
P-phase and P-coda.

The 18 scattering diagrams of the "basic" feature X19 against each of the remaining 18
features were analyzed visually. We concluded that the feature pairs
(X10,X19) and (X1,X19) appear to be the most informative for discrimination. The
scattering diagram for the first pair is shown in Fig. 1, and the diagram for the
pair (X1,X19) in Fig. 2. These figures show that properly drawn
straight lines can separate the earthquake and explosion clusters without any errors.
The result of the automatic stepwise selection of the most informative features (the
output of program "fsel") is displayed in Fig. 3. The dependence of the error probability P(k)
on the number k of selected features shows that the 8 features corresponding to the
first 8 selection steps may be chosen as the best for discrimination. Note that the
visually selected features 19, 1 and 10 belong to the set (19,1,17,10) of features with
the highest ranks. The ranked values of the linear discriminant function
applied to the selected features (the output of program 'exam') are displayed in Fig. 4.
The symbol 'O' is used for earthquakes and the symbol Y for explosions. All
learning observations in the given learning set of explosions and earthquakes are
classified correctly, but the robust classification zone is rather narrow.
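
The idea of stepwise selection, in which the feature that most reduces the estimated error probability P(k) is added at every step and the subset with the smallest P(k) is finally chosen, can be sketched in Python as follows. This is a generic greedy forward selection written for illustration; it is not the "fsel" algorithm itself, and the error function used here is a made-up stand-in for a real error probability estimate.

```python
def forward_select(n_features, error_estimate):
    """Greedy forward selection.

    error_estimate(subset) must return the estimated misclassification
    probability for a list of feature indices. Returns the selection
    history [(subset, error), ...] and the subset with the lowest error.
    """
    selected, history = [], []
    remaining = list(range(n_features))
    while remaining:
        # Add the feature that gives the lowest error together with those chosen so far.
        best = min(remaining, key=lambda j: error_estimate(selected + [j]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), error_estimate(selected)))
    k_best = min(range(len(history)), key=lambda k: history[k][1])
    return history, history[k_best][0]


# Hypothetical error model: informative features shrink the error, while
# every added feature carries a small penalty for the extra estimated parameters.
GAINS = [0.5, 0.1, 0.8, 0.0]


def toy_error(subset):
    err = 0.5
    for j in subset:
        err *= 1.0 - GAINS[j]
    return err + 0.01 * len(subset)


history, best_subset = forward_select(len(GAINS), toy_error)
```

In this toy case the error first drops and then rises again as uninformative features are added, so the minimum of P(k) marks the subset to keep, analogous to choosing the first 8 selection steps above.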


Appendix 2 

Neural network approach to discrimination of regional events 

based on seismic trace sonograms 

The sonogram representation of a seismogram is a promising tool for
seismic data analysis. Time-spectrum maps of seismic traces can reveal obscured
signal features, and these maps can also be processed by neural network tools. Standard
feedforward neural models are now widely used for seismic event recognition [1,2].
The software and the sonogram discrimination analysis methodology described below
provide both interactive and batch modes of discrimination analysis built into the
seismic array data processing system.

Neural networks for image processing are now available in
several signal processing software packages (e.g. in MATLAB). There is also a neural
network routine in the popular SAC package for seismic data analysis. Unfortunately,
these tools lack sonogram processing capabilities. The SNDA package for
seismic array data analysis allows user-defined data processing programs to be
installed easily. The following functions for sonogram discrimination
analysis are implemented in SNDA:

Seismic trace-sonogram conversion program, 

Sonogram visualization tool, 

Sonogram clustering tool based on the ART2 neural network model,

Supervised learning tool based on the backpropagation neural network model,

Recognition tool for seismic event discrimination.

There is also an auxiliary tool for supervised learning network adjustment. 

The sonograms of explosions and earthquakes may be discriminated by several
features extracted from the sonogram image. Experiments with explosion recognition
showed stable discrimination within the same seismic region [2]. Unfortunately,
experiments in which seismograms from different regions were classified with the same
neural network discriminator demonstrated much lower reliability.

Several hypotheses may be considered in seismic event discrimination: chemical
explosion, nuclear blast, distant earthquake, etc. A straightforward feedforward
model for learning and recognition may therefore be extremely difficult to use. A more
flexible technique with several stages of data processing, including clustering and
learning/recognition, is preferable.

We propose the following data processing technique for seismic event
discrimination using neural network programs. It consists of the following stages.

1. Trace to sonogram conversion; 

2. Sonogram clustering; 

3. Pattern learning; 

4. Pattern recognition; 

5. Neural network model adjustment and tuning. 

A short description of each step is given below 

1) Seismic traces in CSS or ASCII format are converted to a time-frequency
(sonogram) map image. The image size is set to 32x32 (1024) pixels. Patterns
of 16x16 pixels, located in the sonogram map area that is most informative for
discrimination, are then cut from every image. These patterns are used for learning,
clustering and recognition.

2) The learning set of sonogram patterns is applied to an ART2 neural network
with variable threshold [3,4]. This program clusters the sonogram
pattern set by assigning every sonogram to one of several clusters. The
number of distinct clusters is thus estimated and may be used to set the number of
output nodes of the supervised learning network used at the next processing stage.

3) At the pattern learning stage, a standard feedforward neural network
with the Backpropagation (or Quickpropagation) learning function [3,4] is used.
The number of output attributes can be obtained from the previous stage. As a rule,
the output attributes for learning are assumed to be binary values, so a network
designed for explosion-earthquake recognition has to have two output nodes.

4) A sonogram of a seismic trace with unknown source type is fed to the trained
network. The "True" value of the binary output is used as a pointer for attributing
the sonogram to a signal source class.

5) If some learning patterns are wrongly classified in the test (the output is not
an exact binary "TRUE" value), the pattern is included in the pattern set for the next
learning cycle. The number of network outputs is incremented; the new output node
represents the new signal class.
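
Stage 1 above (conversion of a trace into a 32x32 time-frequency image and cutting of a 16x16 pattern) can be sketched in Python as follows. The sketch assumes non-overlapping rectangular windows and a plain discrete Fourier transform; the actual windowing, scaling and patch position used by the SNDA conversion program are not given in this report, so the function names and parameters below are hypothetical.

```python
import cmath
import math


def sonogram(trace, n_time=32, n_freq=32):
    """Return image[time][frequency] of spectral power for a list of samples.

    The trace is cut into n_time non-overlapping windows and the power of the
    first n_freq DFT bins is computed in every window.
    """
    win = len(trace) // n_time
    if win < 2 * n_freq:
        raise ValueError("trace too short for the requested image size")
    image = []
    for t in range(n_time):
        seg = trace[t * win:(t + 1) * win]
        row = []
        for k in range(n_freq):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / win)
                      for n, x in enumerate(seg))
            row.append(abs(acc) ** 2 / win)
        image.append(row)
    return image


def cut_patch(image, t0, f0, size=16):
    """Cut a size x size pattern whose corner is at (t0, f0) of the image."""
    return [row[f0:f0 + size] for row in image[t0:t0 + size]]


# Synthetic trace: a sinusoid that falls exactly on DFT bin 8 of each 64-sample window.
trace = [math.sin(2.0 * math.pi * 8 * n / 64) for n in range(2048)]
image = sonogram(trace)
pattern = cut_patch(image, 0, 0)
peak_bin = max(range(32), key=image[0].__getitem__)
```

The 16x16 pattern contains 256 values, which matches the 256-node input layer of the network described in this appendix.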

The technique described was tested using seismograms of Semipalatinsk
nuclear tests and regional earthquakes from Southern Kazakhstan recorded by the Talgar
seismic observatory. Vertical component trace data were used for neural network
training. The learning set consisted of 10 nuclear explosion patterns, 6
earthquake patterns and 2 noise patterns.

The data clustering was performed by the ART-2 model with variable threshold.
The learning set of seismograms was correctly divided into three clusters.

Feedforward and Elman-Jordan neural network models were used for
sonogram learning and recognition. The network was constructed with a 256-node
input layer, 2 hidden layers with 16 nodes each, and a 2-node output layer. The best
learning performance was achieved by the simple feedforward network with the
Quickpropagation learning algorithm. The learning procedure required about 10000
cycles.
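
The layer sizes quoted above (256 inputs, two hidden layers of 16 nodes, 2 outputs) can be written down as a forward pass in a few lines of Python. The sketch below assumes sigmoid activations and uses random, untrained weights, so it illustrates only the data flow; the training itself (Backpropagation or Quickpropagation in SNNS) is not reproduced, and all names are hypothetical.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """One fully connected layer: n_out rows of n_in weights plus a bias term."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layers, x):
    """Propagate an input vector through all layers with sigmoid activations."""
    for layer in layers:
        # zip() pairs the first n_in weights with the inputs; row[-1] is the bias.
        x = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + row[-1])))
             for row in layer]
    return x


rng = random.Random(0)
sizes = [256, 16, 16, 2]   # input, two hidden layers, output
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

pattern = [0.0] * 256      # a flattened 16x16 sonogram pattern (all zeros here)
output = forward(layers, pattern)
label = "explosion" if output[0] > output[1] else "earthquake"
```

With trained weights, the output node with the larger value would indicate the source class, in line with the two-node binary output described in stage 3.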

The following software tools were used as network prototypes [3,4]:

SNNS v4.0, PlaNet 5.7, NBtest (an X-based backpropagation-ART2 tool for
sonogram processing). The NBtest program provides an automated network
adjustment feature. The output layer adjustment procedure was tested in comparison
with the Cascade Correlation learning algorithm. This simplified adaptive procedure
gives a considerable advantage in learning cycle time.

The sonogram evaluation program operates on "wfdisc" files of CSS-format
recordings. This tool picks the arrival point and sample rate automatically.
The results of classifying the seismogram learning set described above with
the neural network algorithm are given in the following table (the seismogram
patterns are denoted by CSS trace labels).
Event           Class                  Neural network output
                                       explosion   earthquake

8051305         (nuclear explosion)    1.0         0.
08081510        (nuclear explosion)    1.0         0.
08081604        (nuclear explosion)    1.0         0.
08081605        (nuclear explosion)    0.0         1.
08081510        (nuclear explosion)    1.0         0.
08081610        (nuclear explosion)    1.0         0.
08081609        (nuclear explosion)    0.0         1.
08081608        (nuclear explosion)    0.0         1.
08081707        (nuclear explosion)    1.0         0.
08050404        (nuclear explosion)    1.0         0.
088258040108    (earthquake)           0.0         1.
088292034130    (earthquake)           0.0         1.
089043041625    (earthquake)           0.0         1.
089048040240    (earthquake)           0.0         1.
089245041825    (earthquake)           0.0         1.
089292095115    (earthquake)           0.0         1.


All earthquakes were classified correctly, but there were 3
mistakes in the nuclear explosion classifications. This result prompts us to choose a more
informative pattern set from the event sonograms or to use additional seismogram
features.


References 

1. P.S. Dysart, J.J. Pulli (1990). Regional seismic event discrimination at the NORESS
array: seismological measurements and the use of trained neural networks. Bull. Seism.
Soc. Am., v.80, n.6, 1910-1933.

2. F.U. Dowla (1995). Neural networks in seismic discrimination. In: E. Husebye, A. Dainty
(eds.), Monitoring of Comprehensive Test Ban Treaty, 111-190, NATO ASI Series,
Kluwer Academic Publishers.

3. SNNS, Stuttgart Neural Network Simulator, User Manual, Version 4.0, Report No
6/95.

4. Yoh-Han Pao (1990). Adaptive Pattern Recognition and Neural Networks, Case
Western Reserve University, Addison-Wesley Publishing Co. Inc.

















































Fig. 2. Scattering diagram between features 1 and 19. 
Feature 1 is the logarithm of normalized P-phase power in 

the low frequency band. 
Fig. 3. Results of arranging the features in order of increasing discrimination power.

This allows one to select the most informative features.





























































Linear discriminant function 
Fig. 4. Results of discriminating the explosion and earthquake sets by examining the
learning data with the cross-validation method.

The learning observations are classified correctly.
2. SCANLOC: AUTOMATIC SEISMIC EVENT LOCATION
IN SPACE AND TIME BASED ON THE PRINCIPLES
OF EMISSION TOMOGRAPHY

2.1. Introduction 

Robust methods for automatic location and origin time determination of
numerous weak earthquakes and mining explosions at local and regional distances are an
important part of modern CTBT monitoring technology. The goal of the current
research is to design a reliable method for fast automatic event location based on local
seismic network data. This is important for monitoring in industrialized or seismically
active regions and for a wide range of other seismic applications. The main effort was
directed at eliminating any need for analyst verification of seismic phase detection
and parameter estimation in the location procedure (which is routine practice in
conventional monitoring systems).

Conventional methods for seismic event location (and for evaluating other
event parameters) are traditionally based on the detection, identification and precise
measurement of seismic phase parameters through individual processing of the recordings
from every seismic network station (Kennett, 1996). Automatic methods designed within this
paradigm have low computational efficiency because it is difficult to develop algorithms
that adequately reproduce the operations of a skilled geophysicist. Interestingly,
current methods for computer processing of seismological data make virtually no use of the
advantages provided by multichannel data processing. From a general point of view,
recording the event wavefield with a network of identical sensors allows us
to treat the set of recordings obtained from all receivers as a single multichannel
seismogram. In many respects, this seismogram is similar to those conventionally used
in seismic prospecting and deep seismic sounding. Powerful methods for optimal
processing of multichannel data have been developed in these fields, and some of them
can be used to solve seismic event location problems. Multichannel approaches
to automatic analysis of seismic data for location purposes are under investigation by a
number of research groups (Ryzhikov et al., 1995; Young et al., 1995; Ringdal and
Kvaerna, 1989). Promising results have been obtained, demonstrating that new
techniques specifically oriented to automated computer data processing are useful for
solving many traditional problems in seismology.
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In the framework of this project we developed and tested a new automatic event 
location method based on processing of multichannel recordings fiom local and 
regional seismic networks. The approach is based on the principles of seismic emission 
tomography technique. The essence of our approach is to look for a "bright spot" in the 
medium by scanning the medium area under investigation with a sounding seismic 
beam in the units of given scanning grid. An advantage of the method is that it does not 
require a detection and parameter measurement of the event seismic phases. 

The results of experimental processing of Israel regional network data obtained
in our study showed that the method determines the event epicenters and
hypocenters with errors which do not exceed 1-3 steps of the scanning grid (in our
experiments the errors were less than 15 km). The computation time for scanning a
square area of 255 km x 255 km while processing data from 10-20 sensors with a
sampling rate of 50 Hz does not exceed 10-20 sec on a SPARC-2 workstation. Thus, a
near real-time automatic location technique could be designed on this basis.


2.2. Analytical formulation 

Emission tomography proper is a method for reconstructing the internal structure
of an object using signals emitted by sources located within the medium.
There are a number of methods for solving this problem. In contrast to classic
tomography, which is based on solution of the inverse Radon problem, we obtain a
three-dimensional (3D) image of the emission sources by scanning the medium volume
under study with a sounding beam formed from seismic array recordings. This approach
was proposed by A.V. Nikolaev and P.A. Troitsky (1987). Later this method was successfully
implemented for mapping the zones of microseismic activity in a hydrothermal area in
NE Iceland (Shoubik et al., 1991; Shoubik and Kiselevich, 1993; Gurevich et al., 1994;
Shoubik et al., 1996). The presence of microseismic sources or contrasting inhomogeneities
within the Earth results in the appearance of coherent, time-distributed components in the
stochastic seismic wave field recorded on the surface by a seismic array. By careful
processing of seismic array data these coherent components can be used to develop a
3D model of microseismic activity in the medium under study or an image of noise
radiating objects.

The essence of the processing algorithm is a comparative assessment of the
coherent signal power radiated from different points (or small volumes) of the area
under study. The area is partitioned into an equidistant scanning grid and values of beam
power are calculated for every unit of the grid. The beam power is assessed by the
observed value of the Signal to Noise Ratio (SNR). If the coordinates of some unit are equal
or closest to the true emitter coordinates, the SNR value calculated for this unit exceeds
the SNR values calculated for the adjacent points. The set of calculated SNR values (a
2D or 3D SNR-map) reflects the spatial distribution of emitters in the area under study.
The main idea of the current research is to apply the approach formulated above
to the spatial and temporal location of the sources of natural and artificial seismic events
rather than seismic noise sources.

Fig. 1 illustrates the principle of the method. $P_1, P_2, \ldots, P_m, \ldots, P_M$ denote the
sensors of the seismic array. The volume (V) below the array is scanned by a beam formed from
the seismic array $P_1, \ldots, P_M$ in the units of the scanning grid. $X_i, Y_j, Z_k$ are the
coordinates of a grid unit and $\varphi_{ijk}$ is the signal radiated from this point.

The processing algorithm is based on a linear additive model of signals and noise
at the network sensors:

$$f_m(t) = \alpha_{mijk}\,\varphi_{ijk}(t - \tau_{mijk}) + \varepsilon_m(t) \qquad (1)$$

where:

$m = 1, 2, \ldots, M$ is the sensor number (M is the number of sensors);

$f_m(t)$ is the seismic trace recorded by the m-th sensor; t is the receiver time;

$\varphi_{ijk}(t)$ is a signal emitted from the source in the grid unit with coordinates
$X_i, Y_j, Z_k$;

$\alpha_{mijk}$ is the amplitude decay factor due to geometric spreading, angle of incidence and
attenuation (absorption and scattering) of the signal $\varphi_{ijk}(t)$ along the path from the
grid unit with coordinates $X_i, Y_j, Z_k$ to the m-th sensor;

$\tau_{mijk}$ is the time delay determined by the signal travel time;

$\varepsilon_m(t)$ is the sum of stochastic noise and signals emitted from other sources not
coinciding with $X_i, Y_j, Z_k$.

The location problem is reduced to a comparative assessment of the energy of the
signals $\varphi_{ijk}$ radiated by different elements (i,j,k) of the medium volume under study
(V). To assess the energy of the correlated components $\varphi_{ijk}(t - \tau_{mijk})$ present
in the multichannel recording ($f_m(t)$, $m = 1, 2, \ldots, M$), two functionals are most
commonly used: Semblance (S) (Neidell and Taner, 1971) and Signal to Noise Ratio
(SNR) (Katz and Shoubik, 1986). Let us designate:

$$A_{ijk}(t_n) = \Big[ \sum_{m=1}^{M} \beta_{mijk}\, f_m(t_n + \hat{\tau}_{mijk}) \Big]^2 \qquad (2)$$

$$B_{ijk}(t_n) = \sum_{m=1}^{M} \big[ \beta_{mijk}\, f_m(t_n + \hat{\tau}_{mijk}) \big]^2 \qquad (3)$$

then the Semblance functional is equal to:

$$S_{ijk} = \frac{\sum_{n=1}^{N} A_{ijk}(t_n)}{M \sum_{n=1}^{N} B_{ijk}(t_n)} \qquad (4)$$

and the SNR functional is equal to:

$$SNR_{ijk} = \frac{\sum_{n=1}^{N} \big( A_{ijk}(t_n) - B_{ijk}(t_n) \big)}{\sum_{n=1}^{N} \big( M B_{ijk}(t_n) - A_{ijk}(t_n) \big)} \qquad (5)$$

where:

$\beta_{mijk}$ is the amplitude normalizing factor, theoretically or experimentally chosen to
approximate the reciprocal value of the amplitude decay factor $\alpha_{mijk}$ for the signal
radiated from the medium element at $X_i, Y_j, Z_k$ (i.e., $\beta_{mijk} \approx 1/\alpha_{mijk}$);

N is the number of samples in the time window for which the estimate of the average
correlated signal power is calculated;

$\hat{\tau}_{mijk}$ is the estimate of the time delay $\tau_{mijk}$ of the signal due to propagation
from the (i,j,k) volume element to the m-th receiver. This estimate is computed for the a priori
medium velocity model.

It can be shown that for the linear additive model (1) and random noise
$\varepsilon_m(t)$ uncorrelated between different sensors, the Semblance (S) converges to the
ratio of coherent signal power to total average seismogram power, varying from 0 to 1; the
Signal to Noise Ratio (SNR) converges to the ratio of coherent signal power to noise power,
varying from 0 to infinity. If the (i,j,k) medium element contains the seismic source,
the SNR value calculated for this element exceeds the SNR values calculated for
neighboring grid points. The set of SNR values (5) calculated for every point $X_i, Y_j, Z_k$ of
the scanned surface area or the scanned medium volume produces a 2D or 3D
SNR-map. This map reflects the spatial distribution of seismic sources in the region
under study.

We attempted to employ the emission tomography method described above,
initially formulated and successfully applied for seismic noise analysis in rather
small areas of microseismic activity, to search for a seismic event source as a radiator
of seismic energy in a significantly larger area of the Earth's crust. The method described
above may be thought of as scanning the medium by a sounding beam formed by the
seismic network “antenna”.
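
The functionals (2)-(5) can be illustrated with a short numerical sketch. It is not the SCANLOC implementation: the function name `beam_functionals`, the use of integer sample delays and the synthetic test data (one random wavelet plus white noise on 8 channels) are assumptions made only for this illustration. The sketch evaluates Semblance and SNR for one grid unit, once with the correct delays and once with wrong (zero) delays.

```python
import numpy as np

def beam_functionals(traces, delays, beta=None):
    """Return (Semblance, SNR) of eqs. (4)-(5) for one grid unit.

    traces : (M, T) array, one row per sensor
    delays : (M,) integer delay estimates in samples
    beta   : (M,) amplitude normalizing factors (all ones if omitted)
    """
    traces = np.asarray(traces, dtype=float)
    delays = np.asarray(delays, dtype=int)
    M, T = traces.shape
    beta = np.ones(M) if beta is None else np.asarray(beta, dtype=float)
    N = T - int(delays.max())            # samples available after shifting
    # aligned[m, n] = beta_m * f_m(t_n + delay_m)
    aligned = np.stack([beta[m] * traces[m, delays[m]:delays[m] + N]
                        for m in range(M)])
    A = aligned.sum(axis=0) ** 2         # eq. (2): square of the channel sum
    B = (aligned ** 2).sum(axis=0)       # eq. (3): sum of channel squares
    semblance = A.sum() / (M * B.sum())              # eq. (4)
    snr = (A - B).sum() / (M * B - A).sum()          # eq. (5)
    return semblance, snr

# Synthetic data following model (1): the same wavelet arrives at each
# sensor with its own delay and is buried in uncorrelated white noise.
rng = np.random.default_rng(0)
true_delays = np.array([0, 12, 25, 37, 50, 62, 75, 90])   # hypothetical values
wavelet = rng.standard_normal(200)
traces = 0.5 * rng.standard_normal((8, 600))
for m, d in enumerate(true_delays):
    traces[m, 100 + d:300 + d] += wavelet

s_true, snr_true = beam_functionals(traces, true_delays)
s_wrong, snr_wrong = beam_functionals(traces, np.zeros(8, dtype=int))
print(f"correct delays: S={s_true:.2f}, SNR={snr_true:.2f}")
print(f"wrong delays:   S={s_wrong:.2f}, SNR={snr_wrong:.2f}")
```

With the correct delays the channels add coherently and both functionals are large; with wrong delays the cross-channel terms average out and the SNR estimate stays near zero, which is the contrast the grid scan relies on.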

Actual implementation of these principles varies depending on the epicentral
distance and the seismic network scale (regional network, local network, or array with an
aperture of a few tens of km). Array data from seismic events typically exhibit coherency
(correlation) between signals recorded at different array sensors, at least in a certain
frequency range. In this situation it would probably be effective to use coherent stacking
in (2) to scan the medium area. However, this is likely to be ineffective for local or
regional seismic networks. In this case incoherent analysis in the course of the grid
scanning should be preferable. In other words, one should use a suitable mask-filter
(e.g., Shoubik, 1980) to transform the initial waveforms (1) into low frequency mask
signals. Functionals of the envelope function, of the STA/LTA function and of the output
signals of an appropriate detector (polarization, amplitude, spectral, etc.) can be used
as such low frequency signal models.
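
As an illustration of one possible mask signal, the sketch below computes a simple STA/LTA power ratio for a single trace. It is a minimal example, not the mask-filter of Shoubik (1980) or the one used in SCANLOC; the function name `sta_lta`, the trailing-window definition and the window lengths (0.5 s and 5 s at a 0.02 s sampling interval) are assumptions chosen for the illustration.

```python
import numpy as np

def sta_lta(trace, n_sta, n_lta):
    """Ratio of short-term to long-term average power, one value per sample.

    Both averages use trailing windows that end at the current sample.
    The first n_lta - 1 samples, where the long window is incomplete,
    are set to 0.
    """
    power = np.asarray(trace, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(power)))
    out = np.zeros(power.size)
    end = np.arange(n_lta, power.size + 1)        # exclusive window end
    sta = (csum[end] - csum[end - n_sta]) / n_sta
    lta = (csum[end] - csum[end - n_lta]) / n_lta
    out[n_lta - 1:] = sta / np.maximum(lta, 1e-12)
    return out

# Synthetic trace: weak noise with a stronger burst starting at sample 600.
rng = np.random.default_rng(1)
trace = 0.1 * rng.standard_normal(1000)
trace[600:700] += rng.standard_normal(100)

mask = sta_lta(trace, n_sta=25, n_lta=250)
onset_estimate = int(np.argmax(mask))
print("largest STA/LTA value at sample", onset_estimate)
```

The resulting low frequency, non-negative function can be substituted for the raw trace in the stacking (2)-(3), so that channels are compared by where their energy changes rather than by waveform shape.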

2.3. SCANLOC program package and results of data processing 

Following the approach described above, the program package SCANLOC was
developed, and first results were obtained in processing data recorded by the Israel Regional
Seismic Network (ISN). ISN is operated by the Seismological Division of the Institute
for Petroleum Research and Geophysics (IPRG) in Holon, Israel. The database,
supplied with ground truth information, was collected by Dr. Y. Gitterman (IPRG) and
used in the discrimination study (Y. Gitterman and T. van Eck, 1993). The waveforms
and ISN bulletin information were kindly prepared and transferred by Dr. V. Pinsky
(IPRG).

The SCANLOC package for automatic event location includes the following
main programs:

- Forward modeling using an a priori velocity model of a stratified, laterally
homogeneous medium. The procedure calculates the time delays, angles of incidence
and amplitude decay factors for the specified number of seismic phases and for a given set
of grid points and station coordinates.

- One-channel low frequency mask-filter. If incoherent stacking is used, the
procedure transforms the original waveforms into low frequency signal models.

- Grid scanning procedure. The procedure calculates the 2D or 3D SNR-maps, i.e., the
set of SNR values (5) for all grid units. This SNR-map reflects the spatial distribution of
seismic emitters in the area under study. The quality and accuracy of the event location
can be assessed by the relationship between the maximum SNR value of the map and the
estimate of the SNR dispersion over the whole map.
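
The quality measure mentioned in the last item can be sketched as follows. The text does not give the exact formula, so this example assumes the ratio of the map maximum to the standard deviation of all map values; the function name `map_peak_quality` and the two synthetic maps (a narrow and a broad peak on a 51 x 51 grid) are likewise hypothetical.

```python
import numpy as np

def map_peak_quality(snr_map):
    """Return (index of the map maximum, peak value / std of the whole map)."""
    snr_map = np.asarray(snr_map, dtype=float)
    peak_index = np.unravel_index(np.argmax(snr_map), snr_map.shape)
    spread = snr_map.std()
    quality = snr_map[peak_index] / spread if spread > 0 else np.inf
    return peak_index, quality

# Two synthetic 51 x 51 maps with a peak at grid unit (30, 20):
# a narrow peak (well-resolved source) and a broad one (poorly resolved).
iy, ix = np.mgrid[0:51, 0:51]
r2 = (iy - 30) ** 2 + (ix - 20) ** 2
sharp_map = np.exp(-r2 / (2 * 2.0 ** 2))
broad_map = np.exp(-r2 / (2 * 10.0 ** 2))

sharp_index, sharp_quality = map_peak_quality(sharp_map)
broad_index, broad_quality = map_peak_quality(broad_map)
print(sharp_index, round(sharp_quality, 1), round(broad_quality, 1))
```

A narrow, isolated peak gives a large ratio, while a broad or noisy map gives a small one, so the ratio can serve as a simple confidence indicator for the located grid unit.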

Efficient methods for scanning through the grid points and for forward
modeling were implemented in the package. In the current version of the SCANLOC
package the computing time for processing data from 10-20 stations with a sampling rate
of 50 Hz while scanning about 3000 grid points does not exceed 10-20 sec on a SPARC-2
workstation. This means that it is possible to develop a near real-time automatic
location procedure.

Note that the current version of the SCANLOC package developed in this
project represents only the first stage of investigation of a new method for event location.
This version was adapted to the ISN data which were accessible to us in this project, and it
demonstrated promising results on these data. These results encourage us to continue
investigation in this direction. Wide testing, adjustment and improvement of the package
on a range of medium models, network data with different event types and
signal/noise levels, etc. should be carried out in order to obtain a highly efficient
program product.

This version of the SCANLOC package is written in standard C and
interacts with the Seismic Processing Shell SNDA (Kushnir et al., 1995) through
compatible formats of input and output files and through the various multichannel
data handling and interactive graphic SNDA tools. This version is available now as an
object module included in the latest version of the SNDA System. Descriptions of the
input and output files of the SCANLOC package and the SNDA JCL scripts used for
preparing and processing the ISN data are presented in the Appendix to this chapter
(infloc.inp, recsel.scr, locproc.scr, locview.scr).

The package works with standard input and output files. The input data file
named “indat” contains the original event seismograms selected from the ISN database
with the help of the SNDA script “recsel.scr”. These seismograms are stored in the
internal SNDA data format and saved in files with the extension “*.pk”. Another
standard input file named “infloc.inp” contains, in ASCII format, the parameters
required by the current version of the SCANLOC program for processing the “indat”
recordings. The description of these parameters and their recommended and limiting values
are given in the Appendix to this Section. The standard output file named “outdat” contains
the output data matrix in ASCII format (SNR-map) calculated by the SCANLOC program
for the seismogram from file “indat” and the parameter values from file “infloc.inp”. The
data in the “outdat” file are arranged in ascending order of the scanning area coordinates
$X_i$ (first index) and $Y_j$ (second index). This format is compatible with the SNDA “surfer”
program.

The seismograms of 19 events recorded by ISN were processed during the course
of this project. For this experiment only good quality station recordings were selected
for every event from the available ISN database. The list of the events used, with
ground truth information, is presented below in Table 1. This table contains the number,
date, magnitude, geographical coordinates and hypocenter depth for the 19 selected events.
This information was taken from the IPRG event catalogue file. The SNDA script “recsel.scr”
was used for screening and selecting the seismograms and for obtaining the local coordinates
of the selected stations. The number of stations selected for further data processing
varied from 8 to 24. For solving the epicenter and hypocenter
location tasks we used a homogeneous a priori seismic velocity model with
velocities of P and S waves equal to Vp = 6.196 km/sec and Vs = 3.367 km/sec. The
validity of this simplest medium model at local distances and the velocity values were
confirmed by regression analysis of the onset times of P and S phases measured on
seismograms of numerous local earthquakes and explosions recorded by the Israel Seismic
Network. The details of this analysis (performed by Dr. V.V. Starostin) are described in
Section 2.8. The epicenters of all 19 selected events are located within the 255 km x
255 km area. This area was covered by a scanning grid with a step of 5 km (51
x 51 = 2601 grid units were used).
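
For a homogeneous velocity model the forward modeling step reduces to straight-ray travel times. The sketch below builds the 51 x 51 grid with a 5 km step and computes P and S delays from every grid unit to every station. It is an illustration under stated assumptions, not the SCANLOC forward modeling code: the function name `travel_times`, the grid origin at -125 km and the three station positions are hypothetical, and only the velocities, the grid step, the grid size and the 0.02 s sampling interval are taken from the text.

```python
import numpy as np

VP_KM_S = 6.196        # P velocity from the text
VS_KM_S = 3.367        # S velocity from the text
DT_S = 0.02            # sampling interval (50 Hz)

def travel_times(stations_xy_km, gx_km, gy_km, depth_km, velocity_km_s):
    """Straight-ray travel times in seconds.

    Returns an array of shape (n_stations, len(gy_km), len(gx_km)).
    """
    sx = stations_xy_km[:, 0][:, None, None]
    sy = stations_xy_km[:, 1][:, None, None]
    dx = gx_km[None, None, :] - sx
    dy = gy_km[None, :, None] - sy
    return np.sqrt(dx ** 2 + dy ** 2 + depth_km ** 2) / velocity_km_s

# 51 x 51 grid with a 5 km step; the -125 km start is an assumed origin.
gx = -125.0 + 5.0 * np.arange(51)
gy = -125.0 + 5.0 * np.arange(51)

# Hypothetical station positions (km) in the local coordinate system.
stations = np.array([[0.0, 0.0], [30.0, 40.0], [-50.0, 20.0]])

tp = travel_times(stations, gx, gy, depth_km=0.0, velocity_km_s=VP_KM_S)
ts = travel_times(stations, gx, gy, depth_km=0.0, velocity_km_s=VS_KM_S)

# Delays in samples, as needed to shift the traces in (2)-(3).
p_delay_samples = np.rint(tp / DT_S).astype(int)
s_delay_samples = np.rint(ts / DT_S).astype(int)
print(tp.shape, p_delay_samples[1, 25, 25], s_delay_samples[1, 25, 25])
```

Because the delays depend only on the geometry and the velocity model, they can be computed once per network configuration and reused for every event.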

Some results of event epicenter and hypocenter location are presented in Table
2 and illustrated by Fig. 2 (a,b,c,d,e) and Fig. 3 (a,b,c,d). Table 2 contains the list of
processed events. Event date and time are given in the second column by a 10 digit event label
containing the year, month, day, hour and minute (2 digits each). The third column shows
the number of stations which recorded the specified event and were selected from the ISN
database for processing by the SCANLOC method. The 4th and 5th columns contain the
event coordinates (Xcat, Ycat) from the catalogue file produced with the help of the
conventional location procedure implemented at IPRG. The 6th, 7th, 8th and 9th
columns contain the coordinates (Xm1, Ym1) corresponding to the maxima of the event
SNR-maps, and the coordinates (Xm2, Ym2) of the map points whose values are the
nearest to the maximum. The latter serve as a measure of map dispersion (they
characterize the width of the map peaks). The coordinates are given in a local orthogonal
system: the geographical coordinates of the system center (0.0, 0.0) coincide with the
coordinates of the epicenter of the first processed event and are equal to LAT = 32.671N,
LON = 35.260E. The last column of Table 2 shows the epicenter location error rounded
off to integer km. One may see that 80% of the location errors do not exceed the
scanning grid interval (5 km).
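
The local coordinates and the location error can be reproduced approximately with a flat-earth conversion. The projection actually used for the catalogue coordinates is not stated in the text, so the equirectangular formula, the spherical Earth radius and the function names below are assumptions; the origin and the event 3 values are taken from Tables 1 and 2.

```python
import math

ORIGIN_LAT = 32.671        # system center from the text (event 1 epicenter)
ORIGIN_LON = 35.260
EARTH_RADIUS_KM = 6371.0   # assumption: mean spherical radius

def to_local_km(lat_deg, lon_deg):
    """Approximate east (X) and north (Y) offsets from the origin, in km."""
    x = (math.radians(lon_deg - ORIGIN_LON) * EARTH_RADIUS_KM
         * math.cos(math.radians(ORIGIN_LAT)))
    y = math.radians(lat_deg - ORIGIN_LAT) * EARTH_RADIUS_KM
    return x, y

def location_error_km(cat_xy, map_xy):
    """Horizontal distance between the catalogue epicenter and the map maximum."""
    return math.hypot(cat_xy[0] - map_xy[0], cat_xy[1] - map_xy[1])

# Event 3 of Table 1 (32.663N, 35.152E); Table 2 lists Xcat = -10.13, Ycat = -0.882.
x3, y3 = to_local_km(32.663, 35.152)
print(f"event 3 local coordinates: X={x3:.2f} km, Y={y3:.2f} km")

# Event 3 of Table 2: catalogue position versus SNR-map maximum (Xm1, Ym1).
err3 = location_error_km((-10.13, -0.882), (-5.0, -15.0))
print(f"event 3 location error: {err3:.1f} km")
```

For event 3 the conversion agrees with the catalogue coordinates to within a few tens of metres, and the distance to the map maximum rounds to the 15 km listed in Table 2.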

Fig. 2 (a,b,c,d,e) and Fig. 3 (a,b,c,d) illustrate the epicenter and hypocenter
location procedure for 2 of the 19 events in more detail. Fig. 2 illustrates some
processing results for the first event from Table 1 (event 8710071514). The deployment
of the stations which recorded the event seismograms is shown in Fig. 2a. The circle on the
map marks both the origin of coordinates and the event epicenter, as discussed above.
The event seismograms are shown in Fig. 2b. The seismograms are ordered according to
the station-epicenter distance. The SCANLOC output SNR-map is presented in Fig. 2c
in the form of a mesh perspective diagram and in Fig. 2d in the form of a contour map.
Some results of hypocenter location for this event are shown in Fig. 2e, which
presents 9 SNR-maps for scanning squares located at the following depths: 0 km, 5 km, 10
km, 15 km, 20 km, 25 km, 30 km, 40 km and 50 km. The values of the SNR-map maxima
are shown below each SNR-map. One can see that the general pattern of the spatial
distribution of the SNR functional is rather stable down to 40 km depth and is completely
destroyed below 40 km depth. The values of the SNR-map maxima (printed below every
SNR-map) increase monotonically with depth down to a depth of about 25 km
and decrease below this level. One can also notice that the ratio of the
maximum SNR value to the dispersion of the SNR-map likewise increases with depth
down to 25 km and sharply decreases below this depth. These observations
demonstrate that the developed event location method could be a very useful tool for near
real-time evaluation of event depth. The depth of the event hypocenter is given in the
catalogue as 12 ± 1.1 km. Most likely this deviation in hypocenter location is related
to the inaccuracy of the simplest seismic velocity model used. It is important to note that
the a priori medium velocity model can be refined in the framework of the SCANLOC event
location method: the peak value of the calculated SNR-map can serve as a performance
criterion of the velocity model used: a better velocity model provides a greater peak.

Fig. 3 illustrates some processing results for event No 13 from Table 2 (event
9104150120). The deployment of the stations which recorded the event seismograms is
shown in Fig. 3a. The circle on the map marks the event epicenter. The event
seismograms are shown in Fig. 3b. The seismograms are ordered according to the
station-epicenter distances. The SCANLOC output SNR-map is presented in Fig. 3c in
the form of a mesh perspective diagram and in Fig. 3d in the form of a contour map.

2.4 Conclusions and recommendations 

1). The new SCANLOC method for automated event location based on local
seismic network data was developed and tested. The processing is founded on the
principles of seismic emission tomography. The results of experiments with Israel Local
Network data revealed the high reliability and precision of weak local event epicenter
and hypocenter location by this technique. The software written can serve as the basis
for developing a near real-time location tool for seismic monitoring at local and
regional distances. Wide testing, adjustment and improvement of the package on a
range of medium models, network data with different event types and signal/noise
levels, etc. should be carried out in order to obtain a highly efficient program product.

2). It is possible to refine the medium velocity model within the framework of
the event location method. The peak value of the calculated SNR-map can serve as a
performance criterion of the velocity model used: a better velocity model provides a
greater peak.

3). The processing approach can be implemented for coherent and incoherent
multichannel seismogram analysis. The precision of weak event location can be
increased by implementing one-channel seismic phase detectors which are more
sensitive than the conventional STA/LTA detector used in our experiments.

2.5 Figures and tables 

Table 1.

List of events, selected for processing by the SCANLOC program

No  Year  Month  Day  Origin time  M            Lat      Long     Depth (km)
 1  1987  10     07   15:15:4.7    1.9          32.671N  35.260E  12+/-1.1
 2  1988  02     24   15:37:26.6   1.5          32.717N  35.251E  [illegible]
 3  1990  08     21   6:22:23.0    1.8          32.663N  35.152E  [illegible]
 4  1990  09     16   9:41:44.4    1.6          32.961N  34.981E  [illegible]
 5  1990  11     17   7:30:16.1    1.2          32.792N  35.270E  [illegible]
 6  1990  12     21   15:24:50.4   1.5          32.853N  35.571E  [illegible]
 7  1991  01     09   2:30:42.2    1.1          32.781N  35.272E  20+/-1.6
 8  1991  01     26   17:46:55.7   2.6          32.782N  35.273E  [illegible]
 9  1991  01     27   3:5:37.2     1.5          32.808N  35.324E
10  1991  02     12   8:32:56.9    1.4          32.857N  35.468E  8.0+/-1.5
11  1991  02     25   6:33:55.4    2.0          32.581N  35.306E  [illegible]
12  1991  04     07   17:18:19.7   1.3          32.845N  35.584E
13  1991  04     15   1:21:28.3    [illegible]  32.848N  35.594E
14  1991  04     15   5:3:49.8     1.5          32.826N  35.574E
15  1991  04     16   6:38:1.3     1.9          32.844N  35.587E  [illegible]
16  1991  04     27   7:13:22.3    1.3          32.852N  35.580E  0.0+/-3.9
17  1991  05     01   20:47:12.1   2.2          32.835N  35.578E  6.0+/-1.3
18  1991  05     03   22:7:19.2    1.0          32.830N  35.583E  6.0+/-5.6
19  1991  05     16   2:50:17.2    1.7          33.080N  34.981E  10+/-1.3

Table 2.

Results of event location by the SCANLOC program

No  Event Date  Num. of stat  Xcat km  Ycat km  Xm1 km  Ym1 km  Xm2 km  Ym2 km  Error km
 1  8710071514   8              0.0      0.0      0.0     0.0     0.0     5.0    0
 2  8802241537   8              0.844    5.101   -5.0     5.0     0.0     0.0    4
 3  9008210621  14            -10.13    -0.882   -5.0   -15.0   -10.0    -5.0   15
 4  9009160941  12            -26.08    32.196  -30.0    35.0   -30.0    30.0    5
 5  9011170729  10              0.937   13.419    0.0    10.0     5.0    10.0    3
 6  9012211524  13             29.113   20.227   30.0    20.0    30.0    25.0    0
 7  9101090230  12              1.124   12.199    5.0    10.0    -5.0    10.0    4
 8  9101261746   9              1.218   12.310    0.0    15.0    -5.0    15.0    3
 9  9101270305   9              5.994   15.195   -5.0    15.0     5.0    15.0   10
10  9102120832  21             19.470   20.647   25.0    20.0    20.0    20.0    5
11  9102250633  23              4.319   -9.980    5.0   -10.0     0.0   -15.0    1
12  9104071717  12             30.332   19.344   30.0    20.0    35.0    15.0    1
13  9104150120  24             31.268   19.679   35.0    15.0    35.0    20.0    5
14  9104150503  15             29.403   17.234   30.0    20.0   -35.0     5.0    3
15  9104160637  19             30.614   19.234   40.0    20.0    45.0    20.0   10
16  9104270713   9             29.956   20.119   30.0    20.0    30.0    15.0    1
17  9105012046  24             29.774   18.233   35.0    15.0    40.0    20.0    6
18  9105032206  11             30.244   17.680   35.0    15.0    30.0    20.0    5
19  9105160249  20             26.050   45.395  -35.0    45.0   -30.0    45.0   11

Fig. 2a. Deployment of the stations which recorded event 8710071514.

Fig. 2b. The seismogram of event 8710071514 recorded by
8 selected Israel Seismic Network stations.

Fig. 2c. The output SNR-map, calculated by SCANLOC program.
The scanning grid area is 255 km x 255 km,
grid intervals are 5 km, depth is 0 km.

Fig. 2d. The contour SNR-map, calculated by SCANLOC program.
The scanning grid area is 255 km x 255 km,
grid intervals are 5 km, depth is 0 km.

(Plot stamp: Mon Aug 5 16:09:34 1996, SYNAPSE Science Center)

Fig. 2e. The output SNR-maps, calculated by SCANLOC program.
The scanning grid area is 255 km x 255 km, grid intervals are 5 km,
the depths are: 0 km, 5 km, 10 km, 15 km, 20 km, 25 km, 30 km, 40 km, 50 km.

Fig. 3b. The seismogram of event 9104150120 recorded by
24 selected Israel Seismic Network stations.

Fig. 3c. The output SNR-map, calculated by SCANLOC program.
The scanning grid area is 255 km x 255 km,
grid intervals are 5 km, depth is 0 km.

Fig. 3d. The contour SNR-map, calculated by SCANLOC program.
The scanning grid area is 255 km x 255 km,
grid intervals are 5 km, depth is 0 km.


2.7. Appendix

Descriptions of SNDA JCL scripts and SCANLOC input file


Appendix 1


Script “recsel.scr” for screening and selecting
the Israel Local Seismic Network seismograms
for further processing by the SCANLOC package:


#script boris/recsel.scr

#This script is designed for screening and selecting the ISN
#recordings for further processing by SCANLOC package


. char dt[29][10]
. && dt[1]  = "8710071514"; dt[2]  = "8802241537"
. && dt[3]  = "8803031315"; dt[4]  = "8904140553"
. && dt[5]  = "9008112214"; dt[6]  = "9008210621"
. && dt[7]  = "9009041643"; dt[8]  = "9009160941"
. && dt[9]  = "9011170729"; dt[10] = "9012200001"
. && dt[11] = "9012211524"; dt[12] = "9101090230"
. && dt[13] = "9101261746"; dt[14] = "9101261902"
. && dt[15] = "9101270305"; dt[16] = "9102120832"
. && dt[17] = "9102250633"; dt[18] = "9104051807"
. && dt[19] = "9104071717"; dt[20] = "9104150120"
. && dt[21] = "9104150503"; dt[22] = "9104160637"
. && dt[23] = "9104270713"; dt[24] = "9105011435"
. && dt[25] = "9105012046"; dt[26] = "9105032206"
. && dt[27] = "9105160249"; dt[28] = "8808061449"

. char path[] = "/detseis/seis/alex/data/Israel/earthq/"

. int i

. for (i=1; i<29; i=i+1)
map plot/israelregcom.par
&& clearstack; readpack &path.&dt[i]..pk
&& episortl; plot all -y
echo PLEASE ASSIGN THE TIME WINDOW and NUMBERS OF CHANNELS CHOOSED:
pause
winon &sndf1 &sndf2
&& keep (&sndc1); cut all; plot all -y
flist all data/boris/&dt[i].locb.flist
pause
echo YOU MAY PUSH "READ" BUTTON to display MAP
savepack &path.&dt[i].locb.pk
. endfor
end of script


Appendix 2


Script “locproc.scr” for processing the preliminarily selected seismograms
of the Israel Local Seismic Network by the SCANLOC package in order
to obtain the spatial distribution of seismic sources
located in the scanning area.


# script boris/locproc.scr

# This script is designed for processing the preliminarily selected
# 19 ISN recordings by SCANLOC package in order to obtain spatial
# distribution of seismic sources located in the scanning area.


char dt[29][10], dn[29][10]
&& dt[1]  = "8710071514"; dn[1]  = "1"
&& dt[2]  = "8802241537"; dn[2]  = "2"
&& dt[3]  = "9008210621"; dn[3]  = "3"
&& dt[4]  = "9009160941"; dn[4]  = "4"
&& dt[5]  = "9011170729"; dn[5]  = "5"
&& dt[6]  = "9012211524"; dn[6]  = "6"
&& dt[7]  = "9101090230"; dn[7]  = "7"
&& dt[8]  = "9101261746"; dn[8]  = "8"
&& dt[9]  = "9101270305"; dn[9]  = "9"
&& dt[10] = "9102120832"; dn[10] = "10"
&& dt[11] = "9102250633"; dn[11] = "11"
&& dt[12] = "9104071717"; dn[12] = "12"
&& dt[13] = "9104150120"; dn[13] = "13"
&& dt[14] = "9104150503"; dn[14] = "14"
&& dt[15] = "9104160637"; dn[15] = "15"
&& dt[16] = "9104270713"; dn[16] = "16"
&& dt[17] = "9105012046"; dn[17] = "17"
&& dt[18] = "9105032206"; dn[18] = "18"
&& dt[19] = "9105160249"; dn[19] = "19"

. char path1[80] = "/detseis/seis/alex/data/israel/earthq/"
. char path2[80] = "/detseis/seis/aset/boris/scngr/"

. int i

. for (i=1; i<20; i=i+1)
clearstack
readpack &path1.&dt[i].locb.pk
savessa &path2.indat
unix cp &path2.ifisrloc&dn[i] &path2.infloc.inp
unix &path2.scanloc
unix cp &path2.outdat &path2.outSMAPloc&dn[i]
. endfor
end of script


Appendix 3


Script “locview.scr” for screening the original seismograms,
mapping the selected station deployments and imaging the SNR-maps
calculated by the SCANLOC package for 19 selected events
registered by the Israel Local Seismic Network


# script boris/locview.scr

# This script is designed for screening the original recordings,
# the map of the recording seismic stations deployment and the SNR-maps
# calculated by the SCANLOC package on the 19 ISN recordings


char dt[29][10], dn[29][10]
&& dt[1]  = "8710071514"; dn[1]  = "1"
&& dt[2]  = "8802241537"; dn[2]  = "2"
&& dt[3]  = "9008210621"; dn[3]  = "3"
&& dt[4]  = "9009160941"; dn[4]  = "4"
&& dt[5]  = "9011170729"; dn[5]  = "5"
&& dt[6]  = "9012211524"; dn[6]  = "6"
&& dt[7]  = "9101090230"; dn[7]  = "7"
&& dt[8]  = "9101261746"; dn[8]  = "8"
&& dt[9]  = "9101270305"; dn[9]  = "9"
&& dt[10] = "9102120832"; dn[10] = "10"
&& dt[11] = "9102250633"; dn[11] = "11"
&& dt[12] = "9104071717"; dn[12] = "12"
&& dt[13] = "9104150120"; dn[13] = "13"
&& dt[14] = "9104150503"; dn[14] = "14"
&& dt[15] = "9104160637"; dn[15] = "15"
&& dt[16] = "9104270713"; dn[16] = "16"
&& dt[17] = "9105012046"; dn[17] = "17"
&& dt[18] = "9105032206"; dn[18] = "18"
&& dt[19] = "9105160249"; dn[19] = "19"

. char path1[] = "/detseis/seis/alex/data/israel/earthq/"
. char path2[] = "/detseis/seis/aset/boris/scngr/"
. char path3[] = "data/israel/earthq/"
. char path4[] = "/home/lap/snda/sun4/scr/boris/"

. int i

. for (i=1; i<20; i=i+1)
unix rm &path2.OUT
clearstack
readpack &path1.&dt[i].locb.pk
plot all -y
flist all data/boris/maskll
unix cp &path2.outSMAPloc&dn[i] &path4.OUT
. when (i=1)
map plot/israelregcoml.par
surfer &path4.surmapOUT.par
. endwhen
pause
. endfor
end of script


Appendix 4


Example of input file “infloc.inp” for SCANLOC program
intended for processing of event 9104150120 seismograms


4700                                  NUMBER OF SAMPLES
0.02                                  SAMPLING INTERVAL
24                                    NUMBER OF STATIONS
2 6196. 3367.                         NUMBER OF PHASES AND PHASE VELOCITIES
-125000.  5000.  51                   Y-AXIS GRID PARAMETERS
-125000.  5000.  51                   X-AXIS GRID PARAMETERS
0.        1000.  1                    Z-AXIS GRID PARAMETERS
 37783     4952  0.0  1.0  GLH        STATIONS INFORMATION
 14487    35500  0.0  1.0  JRMK
 52904    35188  0.0  1.0  KSHT
  1686    16858  0.0  1.0  ATZ
  1969     2884  0.0  1.0  HRSH
 13617   -11635  0.0  1.0  GVMR
 -2054    45360  0.0  1.0  ADI
 15047   -25828  0.0  1.0  MML
 45747    66319  0.0  1.0  HRI
-10149   -18515  0.0  1.0  MAMI
-24464     7239  0.0  1.0  BRN
 24876   -45548  0.0  1.0  HMDT
-21583   -47995  0.0  1.0  ZNT
  8322   -81726  0.0  1.0  JVI
-16112  -105224  0.0  1.0  BGI
 12247  -122194  0.0  1.0  DSI
-13797  -146143  0.0  1.0  YTIR
-10416  -191051  0.0  1.0  MKT
-55653  -179933  0.0  1.0  RTMM
-45367  -185637  0.0  1.0  MASH
-73832  -185699  0.0  1.0  KER
-60283  -239448  0.0  1.0  RMN
-25094  -257213  0.0  1.0  PRNI
-57100  -273390  0.0  1.0  SGI
9104150120                            LABEL OF EVENT


Explanation of program input parameters:

NUMBER OF SAMPLES

This string defines the number of samples contained in each seismic trace of the
processed seismogram. In our experiments with ISN data processing this parameter
took values in the range 800 - 5000. This value is limited only by the available computer
memory.

SAMPLING INTERVAL

This string defines the sampling interval of each seismic trace, expressed in seconds. In our
experiments with ISN data processing this parameter was equal to 0.02 sec.

NUMBER OF STATIONS

This string defines the number of seismic traces contained in the processed
seismogram. In our experiments with ISN data processing this parameter took
values in the range 8 - 24. This value is limited only by the available computer memory.

NUMBER OF PHASES AND PHASE VELOCITIES

This string defines the number of seismic phases (first parameter) and their velocities
expressed in m/sec (following parameters). If incoherent stacking is used, the first
parameter is determined by the number of seismic phases which can be detected by the
one-channel mask-filter. Phase velocities are determined by the a priori velocity model of the
medium. In our experiments with ISN data processing we used 2 phases (P
and S) and the homogeneous a priori seismic velocity model with velocities of P and S
waves equal to Vp = 6196 m/sec and Vs = 3367 m/sec.

Y-AXIS GRID PARAMETERS

This string defines part of the scanning grid parameters. The first parameter is the
starting value of the scanning grid coordinates along the Y axis, expressed in meters. The
second parameter is the grid step, expressed in meters. Coordinates are given in the local
orthogonal system. The third parameter is the number of grid units along the Y axis. In
our experiments with ISN data processing we specified from 1 to 100
grid units along the Y axis.

X-AXIS GRID PARAMETERS 

This string defines the part of scanning grid parameters. The first parameter is the 
beginning value of scanning grid coordinates by X axis expressed in meter. The second 
parameter is the grid step expressed in meter. Coordinates are given in the local 
orthogonal system. The third parameter is the number of grid units by the X axis. In the 
our experiments with ISN data processing we have specified this value from 1 to 100 
grid units by X axis. 

Z-AXIS GRID PARAMETERS 

This string defines the part of scanning grid parameters. The first parameter is the 
beginning value of scanning grid coordinates by Z axis expressed in meter. The second 
parameter is the grid step expressed in meter. Coordinates are given in the local 
orthogonal system. The third parameter is the number of grid units by the Z axis. In the 
our experiments with ISN data processing we have specified this value from 1 to 100 
grid units by Z axis. 

STATION INFORMATION 

This section of infloc.inp file contains the strings which number are equal to number of 
station. Each string includes 5 parameters. The first parameter is the X coordinate of 
station expressed in meter. The second parameter is the Y coordinate of station 
expressed in meter. The third parameter is the Z coordinate of station (the station 
elevation with respect to sea level) expressed in meter. The coordinates of seismic 
stations are defined in the local orthogonal system which is the same as coordinate 
system to scanning grid units. The forth parameter may take two values: 0 or 1. If it is 
equal to 0 the data of corresponding station does not take part in event location 
procedure. Sometimes it could be necessary to exclude part of data from further 
processing by arbitrary reasons. The fifth parameter is the short name of station. 

LABEL OF EVENT 

This string contents the identification information about processed event. 
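
The way the grid and velocity parameters fit together can be illustrated by a minimal Python sketch. It is not part of SCANLOC: the function names `grid_axis` and `travel_time` are hypothetical, the "number of grid units" is assumed here to mean the number of grid nodes along an axis, and the travel time is the straight-ray time in the homogeneous velocity model described above. The numerical values are those of the example file.

```python
import math

def grid_axis(start, step, count):
    """Coordinates of `count` grid nodes: start, start + step, ..."""
    return [start + k * step for k in range(count)]

# Axis parameters (start, step, number of units) from the example file.
# Assumption: "number of grid units" is the number of nodes along the axis.
y_nodes = grid_axis(-125000.0, 5000.0, 51)
x_nodes = grid_axis(-125000.0, 5000.0, 51)
z_nodes = grid_axis(0.0, 1000.0, 1)

VP, VS = 6196.0, 3367.0  # P and S velocities from the example file, m/sec

def travel_time(node, station, velocity):
    """Straight-ray travel time (sec) in a homogeneous medium."""
    return math.dist(node, station) / velocity

glh = (37783.0, 4952.0, 0.0)                   # station GLH: X, Y, Z in meters
node = (x_nodes[25], y_nodes[25], z_nodes[0])  # central node of the grid
tp = travel_time(node, glh, VP)
ts = travel_time(node, glh, VS)
print(f"P: {tp:.2f} s, S: {ts:.2f} s")
```

With these values the grid spans 250 km in both horizontal directions at a single depth level, and every node yields one P and one S delay per station.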
2.8. Study of Israel regional velocity model
on the basis of Israel Seismic Network catalogue
of local events.

The travel time tables of seismic wave phases typical for a given region are
necessary for seismic event location based on seismic array and local network data.
The accuracy of event hypocenter determination depends on how well the phase
travel times estimated from the tables used in the location procedure correspond to the
real phase time delays for the event under study. Regional travel time tables are usually
created by seismic sounding of the regional medium with artificial (mainly explosion)
sources. Such tables turn out, as a rule, to be rather "smoothed" and not always adequate
to the real medium structure in different areas of the region. Detailing the medium
structure and revealing local medium heterogeneity demands rather expensive
experimental field studies. At the same time, the estimates of onset times of wave phases
from regional and local events can be utilized for this purpose. If the collected statistics
become representative enough, then for every source-station pair the corrections of the
preliminary travel time table can be calculated.

Here the corrected travel time is $\hat{t}_{ij} = t_{ij} + \delta_{ij}$, where $t_{ij}$ is the
interpolated value of the travel time table and $\delta_{ij}$ is the correction obtained
from experiments. Such work on travel time table correction is conducted at many
seismological organizations that have been running local seismic networks for a long
time. If the structure of the earth medium in the area of seismic network deployment is
homogeneous enough, then analytic equations can be constructed for the regional phase
travel time curves against epicenter distance. This can be done if a sufficient number of
arrival time estimates has been accumulated over a long period of observations.

We made an attempt to determine such equations for the Israel region using the
method of regression analysis applied to data acquired by the Seismological Division of
the Institute for Petroleum Research and Geophysics (IPRG) in Holon, Israel, during the
period 1987 - 1991. The deployment of the Israel Local Network stations is shown on the
map in Fig. 4. The source locations of weak earthquake and explosion events contained
in the available IPRG data base are depicted on the map in Fig. 5.

The purposes of our study were:

1) to check the correspondence between the theoretical regional travel time table
available from NORSAR publications and the real travel times observed for seismic
phases detected in Israel events;

2) to correct the regional travel time table in accordance with the results of observations;

3) to derive analytic equations for the travel time curves as functions of epicenter distance;
these equations could be used in fast algorithms for event location by scanning the
medium with the beam formed using P and S phase traces in seismograms registered by
the local network stations;

4) to investigate the character of the variations of P- and S-phase apparent velocities as
functions of the epicenter distance.

We have analyzed the catalog of weak earthquakes containing arrival times for 4
regional seismic phases (Pg, Sg, Pn and Sn) originating from sources located in the
Galilee-Kinneret-Coast Tzor region. The earthquakes were recorded by the stations of
the Israel Local Network; their source parameters are presented in Table 1. The
maximum distance between a source and a station was about 350 km. For these sources
the catalogue contains the onset times of 313 Pg phases, 121 Sg phases, 262 Pn phases
and 225 Sn phases. This amount of experimental data was sufficient to obtain analytical
equations for the travel time curves of these phases and to study the behavior of the
phase apparent velocities against the distance from the source.

Table 1.

 N   Year  Month  Date  Origin time  Latitude   Longitude  Name of region
 1   1987  OCT      7   15:15:04.7   32.671 N   35.260 E   Galilee Reg
 2   1988  FEB     24   15:37:26.6   32.717 N   35.251 E   Galilee Reg
 3   1988  MAR      3   13:15:25.4   32.743 N   35.258 E   Galilee Reg
 4   1988  JUL     29   23:33:28.6   32.664 N   35.218 E   Galilee Reg
 5   1988  AUG      6   14:49:27.5   32.712 N   35.142 E   Galilee Reg
 7   1989  AUG     19   09:17:05.4   32.790 N   35.283 E   Galilee Reg
 8   1990  AUG     11   22:14:59.6   32.987 N   35.413 E   Galilee Reg
 9   1990  AUG     17   09:30:32.1   33.140 N   35.262 E   Tzor (Tyre) Reg
10   1990  AUG     21   06:22:23.0   32.663 N   35.152 E   Galilee Reg
11   1990  SEP      4   16:44:21.1   32.908 N   35.035 E   Galilee Reg
12   1990  SEP     16   09:41:44.4   32.961 N   34.981 E   Off Coast Haifa Reg
13   1990  NOV     17   07:30:16.1   32.792 N   35.270 E   Galilee Reg
14   1990  DEC     20   00:02:18.5   32.547 N   35.226 E   Galilee Reg
15   1990  DEC     21   15:24:50.4   32.853 N   35.571 E   Kinneret Reg
16   1991  JAN      9   02:30:42.2   32.781 N   35.272 E   Galilee Reg
17   1991  JAN     26   17:46:55.7   32.782 N   35.273 E   Galilee Reg
18   1991  JAN     26   19:03:24.0   32.792 N   35.296 E   Galilee Reg
19   1991  JAN     27   03:05:37.2   32.808 N   35.324 E   Galilee Reg
20   1991  FEB     12   08:32:56.9   32.857 N   35.468 E   Galilee Reg
21   1991  FEB     25   06:33:55.4   32.581 N   35.306 E   Galilee Reg
22   1991  APR      5   18:08:24.2   33.084 N   35.039 E   Tzor (Tyre) Reg
23   1991  APR      7   17:18:19.7   32.845 N   35.584 E   Kinneret Reg
24   1991  APR     15   01:21:28.3   32.848 N   35.594 E   Kinneret Reg
25   1991  APR     15   05:03:49.8   32.826 N   35.574 E   Kinneret Reg
26   1991  APR     16   06:38:01.3   32.844 N   35.587 E   Kinneret Reg
27   1991  APR     27   07:13:22.3   32.852 N   35.580 E   Kinneret Reg
28   1991  MAY      1   14:36:22.4   32.847 N   35.580 E   Kinneret Reg
29   1991  MAY      1   20:47:12.1   32.835 N   35.578 E   Kinneret Reg
30   1991  MAY      3   22:07:19.2   32.830 N   35.583 E   Kinneret Reg
31   1991  MAY     16   02:50:17.2   33.080 N   34.981 E   Off Coast Tzor Reg


We have used the least squares (RMS) method of regression analysis to create a
statistical model of the phase travel time curves. We assumed that the regression is not
linear and that the travel time curves may be approximated by a second-order polynomial:

$T(s) = b_0 + b_1 s + b_2 s^2$,

where $T$ is the travel time, $s$ is the distance from source to station, and
$b = (b_0, b_1, b_2)^T$ is the vector of regression coefficients. To obtain the regression
equation $T = f(s)$ let us introduce the matrices

$S = \begin{pmatrix} 1 & s_1 & s_1^2 \\ \vdots & \vdots & \vdots \\ 1 & s_N & s_N^2 \end{pmatrix}$, $t = (t_1, \dots, t_N)^T$, $T = (T(s_1), \dots, T(s_N))^T = S b$.

The results of observations form the column vector $t$, and the values of the regression
curve form the vector $T$. Our task is to estimate the coefficients $b_0, b_1, b_2$ of the
column vector $b$. If we denote $X = S'S$, then the RMS estimates of $b$ are obtained as
the solution of the matrix equation $X b = S't$:

$b = X^{-1} S' t$.

The results of the estimation of the regression coefficients are presented in Table 2.

Table 2.

Values of coefficients for different phases

Coefficients   Pg         Sg         Pn        Sn
b0             0.00047    0.00219    1.1745    1.9951
b1             0.16137    0.29682    0.1671    0.2900
b2             18*10^-7   16*10^-7   -0.0001   -0.0001
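
The normal-equation estimate $b = X^{-1}S't$ can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name `fit_travel_time` is hypothetical, the distances and travel times are synthetic (generated from the Pn coefficients of Table 2, without noise), and the catalogue data themselves are not reproduced here.

```python
import numpy as np

def fit_travel_time(s, t, order=2):
    """Least squares polynomial fit T(s) = b0 + b1*s + ... via normal equations."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    S = np.vander(s, order + 1, increasing=True)  # columns: 1, s, s^2, ...
    X = S.T @ S                                   # X = S'S
    b = np.linalg.solve(X, S.T @ t)               # solves X b = S't
    return b, S, X

# Synthetic example (assumed values): distances in km, noise-free times in sec
s = np.linspace(10.0, 350.0, 50)
t = 1.1745 + 0.1671 * s - 1.0e-4 * s**2
b, S, X = fit_travel_time(s, t)
print(b)
```

Solving the linear system with `np.linalg.solve` avoids forming $X^{-1}$ explicitly; for noisy real data the same call applies unchanged.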
For developing our second-order regression travel time curves we assumed the
following statistical model of the estimated phase onset times:

$t_i = b_0 + b_1 s_i + b_2 s_i^2 + \varepsilon_i$,

where the epicentral distances $s_i$ are free of errors and $\varepsilon_i$ are random
independent Gaussian variables with zero means and equal variances. We also assumed
that the random error of travel time estimation $\varepsilon_i$ depends on the accuracy of
the event origin time determination and the accuracy of the P- and S-phase arrival time
estimation by a station operator. This error is regarded as the same for all observations
over the network stations:

$\sigma_\varepsilon^2 = \sigma_{t_0}^2 + \sigma_{t_1}^2$,

where $\sigma_{t_0}$ is the standard deviation of the origin time and $\sigma_{t_1}$ is the
standard deviation of the arrival time.

Note that the most correct approach would account for the fact that any source
location method is based on minimizing the sum of residual times, so the ellipse of
source coordinate errors is defined by the sum of squares of the phase travel time
residuals. In other words, the origin time and arrival time errors are not independent.
But this approach is rather complex, and we confine the analysis to the above
assumption of independence of the origin and arrival time errors.

We have carried out the variance analysis of the regression to evaluate its
significance and have applied the F-criterion to estimate the significance of every
coefficient in the regression equation. The statistic of the F-criterion is defined by the
equation:

__ 2 

b's , t-jvr 

where $\bar{t}$ is the mean value of the components of the vector $t$.

The results of the variance analysis for the second-order regression are shown in Table 3.

Table 3. 


Phase type   b'S't, x10^4   t't, x10^4   [...]     F
Pg           2.706851       2.706859     0.0028    3690
Sg           2.5100179      2.5100181    0.0044    3613
Pn           16.80232       16.8037      0.2337    3.931
Sn           48.91412       48.9176      0.3942    4.685
One may infer that the regression is highly significant with respect to the table
values of the F-distribution even at the 2.5% level, and thus the regression model
developed is adequate to the real crust velocity features.

The straightforward equations obtained for phase travel times against distance
can be employed in fast real-time algorithms for source location without loss of
accuracy in comparison with the more time consuming procedure of interpolating
experimental travel time tables.

For the initial fitting in these procedures the more precise first-order regression
could also be helpful. The regression coefficients for the first-order regression are shown
in Table 4.

Table 4.

Values of coefficients for different phases

Coefficient   Pg        Sg        Pn       Sn
b0            -0.0001   -0.0016   2.2169   3.9868
b1            0.1614    0.2971    0.1442   0.2488

The results of the dispersion analysis of the first-order regression coefficients are
shown in Table 5.

Table 5.

Phase type   b'S't, x10^4   t't, x10^4   [...]     [...]
Pg           2.7068595      2.7068597    0.0028    3660
Sg           2.5100179      2.5100181    0.0047    3137
Pn           16.793         16.8037      0.6321    0.54
Sn           48.889         48.9176      1.1312    0.56


The application of the F-criterion to the regression coefficients shows that at the
5% level the coefficients $b_0$ and $b_1$ are statistically significant for all the phase
regression equations being analyzed. However, at the 2.5% level only the coefficients
for the Pg and Sg phases are significant. So for the Pn and Sn phases the second-order
regressions have to be adopted, while the linear equations for the travel time against the
distance with the coefficients given in Table 4 can be recommended for the Pg and Sg
phases.
The linear and quadratic regression functions for the Pn and Sn phase travel time
against epicenter distance are depicted in Fig. 6 together with the experimental phase
onsets. We see that even visually the quadratic functions fit the sets of Pn and Sn arrival
time observations much better than the linear ones. The linear regression functions for the
Pg and Sg phase travel time against epicenter distance are depicted in Fig. 7 together with
the experimental phase onsets. We see that for the "g"-phases the arrival time observations
are in good accordance with the linear regression curves.

The estimates of the variances of the regression coefficients can be determined from the
covariance matrix using the known equation $\mathrm{cov}\{b\} = \sigma_\varepsilon^2 X^{-1}$.
For the linear regression model it implies the following equations:

$\mathrm{var}\{b_0\} = \sigma_\varepsilon^2 (X^{-1})_{00}$, $\mathrm{var}\{b_1\} = \sigma_\varepsilon^2 (X^{-1})_{11}$.

The values of the standard deviations of the linear regression coefficients determined by
this method are presented in Table 6.

Table 6. 

Values of coefficient standard deviations for different phases 


Coefficient   Pg        Sg       Pn       Sn
b0            0.19145   0.2926   0.1708   0.195
b1            0.00695   0.0121   0.0028   0.003
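
The relation $\mathrm{cov}\{b\} = \sigma_\varepsilon^2 X^{-1}$ can be illustrated with a short Python sketch for the linear model. It rests on assumptions that are not taken from the report: the function name `linear_fit_with_std` is hypothetical, the data are synthetic, and $\sigma_\varepsilon^2$ is estimated here from the fit residuals with $N - 2$ degrees of freedom rather than from separate origin time and arrival time error estimates.

```python
import numpy as np

def linear_fit_with_std(s, t):
    """Fit t = b0 + b1*s and return b, std of b, and the residual variance."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    S = np.column_stack([np.ones_like(s), s])
    X = S.T @ S
    b = np.linalg.solve(X, S.T @ t)
    residuals = t - S @ b
    # Assumption: sigma^2 is estimated from residuals with N - 2 degrees of freedom
    sigma2 = residuals @ residuals / (len(t) - 2)
    cov_b = sigma2 * np.linalg.inv(X)          # cov{b} = sigma^2 X^-1
    return b, np.sqrt(np.diag(cov_b)), sigma2

# Synthetic data (assumed values): Pg-like slope with 0.2 sec Gaussian errors
rng = np.random.default_rng(0)
s = np.linspace(5.0, 100.0, 200)
t = 0.1614 * s + rng.normal(0.0, 0.2, size=s.size)
b, std_b, sigma2 = linear_fit_with_std(s, t)
print(b, std_b)
```

For the linear model the diagonal element $(X^{-1})_{11}$ reduces to $1/\sum_i (s_i - \bar{s})^2$, which gives a quick hand check of the slope uncertainty.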


Employment of the linear travel time equations at local distances makes it possible
to simplify the minimum mean square source location algorithm when it is applied to
local network data. This algorithm uses the linearization method for solving the inverse
location problem, so the assumption of linearity of the travel time curves reduces the
amount of computation.

The results discussed above concerned the development of the relations between
phase travel time and epicenter distance: $T = f(s)$. The inverse relations (distance
against travel time) give an impression of the variations of the phase apparent
velocities along the horizontal distance of phase propagation. For this reason we
also developed similar quadratic and linear regression equations for the relation
$s = f(T)$. The coefficients of the linear regressions for all the regional phases are shown
in Table 7. The plots of the horizontal distance of the Pn and Sn propagation paths against
travel time, with the quadratic regression fitting curves, are presented in Fig. 8. The
analogous plots for the Pg and Sg phases with the linear regression curves are shown in
Fig. 9.


Table 7. 


Values of coefficients for different phases 


Coefficient   Pg       Sg       Pn         Sn
b0            0.0008   0.0056   -15.0643   -15.6602
b1            6.1959   3.3668   6.9221     4.0102
Fig. 6. Linear and quadratic regressions for Pn and Sn phase travel time
as a function of epicenter distance.

Fig. 7. Linear regressions for Pg and Sg phase travel time
as a function of epicenter distance.

Fig. 8. Horizontal distance of Pn and Sn propagation paths
as a function of travel time.

Fig. 9. Horizontal distance of Pg and Sg propagation paths
as a function of travel time.
3. ADAPTIVE PROCESSING OF 3-COMPONENT SMALL APERTURE
AND MICRO-ARRAY DATA
(THEORY AND ALGORITHMS)


3.1. Mathematical models of 3-component small aperture seismic array records 

Let us suppose hereinafter that seismic waves are generated by a remote seismic source and
that each of the seismic phases (P, S, L, R, etc.) is a plane wave. It is also supposed that the
body wave phases arrive from the homogeneous lower half space at a surface batch of laterally
homogeneous layers in accordance with unit steering vectors $a_w = (a_{wx}, a_{wy}, a_{wz})^T$,
where $w$ represents the wave-type index. We designate as $V_w$ the velocity of a $w$-type wave
in the half space directly beneath the batch of layers. Let us assume the origin of coordinates to
be placed on the surface of the half space, with the Z-axis directed down, the Y-axis to the North
and the X-axis to the East; the wave azimuth $\alpha$ is counted clockwise from the positive
direction of the Y axis, and the wave incidence angle $\beta_w$ from the positive direction of the
Z axis. Then the vector $a_w$ can be written as
$a_w = (\sin\alpha \sin\beta_w,\ \cos\alpha \sin\beta_w,\ \cos\beta_w)^T$ ($T$ is the sign of
transposition). For surface waves $\beta_w = \pi/2$ and $\cos\beta_w = 0$.

The medium displacement $w(t,r) = (w_x(t,r), w_y(t,r), w_z(t,r))^T$ for a particular seismic
phase at an arbitrary point $r = (r_x, r_y, r_z)^T$ of the homogeneous half space can be
expressed as follows:

$w(t,r) = s_w(t - (r^T a_w)/V_w)\, b_w$ (1)

where $s_w(t)$ is the waveform of the seismic phase at the origin of coordinates and
$b_w = (b_{wx}, b_{wy}, b_{wz})^T$ is the unit vector of seismic phase oscillations, which is
determined by the wave incidence angle $\beta$, the azimuth $\alpha$ and the model of
transformations of the plane wave at the day surface boundary. For the simplest medium model,
which does not consider the effect of the day surface on the wave field, the vector $b$ is
expressed by the following simple geometric equations [6]:

for P-waves $b_P = (\sin\alpha \sin\beta,\ \cos\alpha \sin\beta,\ \cos\beta)^T$; (2a)

for SH-waves and for Love waves $b_L = (\cos\alpha,\ \sin\alpha,\ 0)^T$; (2b)

for SV-waves $b_{SV} = (\sin\alpha \cos\beta,\ \cos\alpha \cos\beta,\ \sin\beta)^T$; (2c)

for Rayleigh waves $b_R = (i \sin\alpha \sin\psi,\ i \cos\alpha \sin\psi,\ \cos\psi)^T$; (2d)


where $\psi = \mathrm{arctg}(e)$, $e$ represents the ellipticity of a Rayleigh wave, i.e. the ratio
of the minor axis of the polarization ellipse to the major one; $i = \sqrt{-1}$ characterises the
phase shift of $\pi/2$ between the vertical and horizontal components of Rayleigh wave
displacements.

In the frequency domain eq.(1) has the form

$w(f,r) = s_w(f) \exp[-i 2\pi f (r^T a_w)/V_w]\, b_w$ (3)

where $s_w(f)$ is the complex spectrum of the phase waveform. Let us introduce the
3-dimensional phase slowness vector $q_w = (p_x, p_y, p_z)^T = a_w/V_w$. Then the wave field
eq.(3) can be expressed as a function of the vector $q_w$ as follows:

$w(f,r) = s_w(f) \exp[-i 2\pi f (r^T q_w)]\, b(q_w, V_w)$ (4)

The dependence of the vector $b$ on the vector $q_w$ in eq.(4) is based on the following simple
geometric relations:

$\sin\alpha = p_x/p_h$; $\cos\alpha = p_y/p_h$; $\sin\beta_w = p_h V_w$; $p_h = (p_x^2 + p_y^2)^{1/2}$. (5)
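
The relations (5) can be turned into a small Python sketch that recovers the azimuth and incidence angle from the horizontal slowness components and then builds the P-wave vector of eq.(2a). The function names are hypothetical, and the numerical values (a 6 km/sec velocity, a 30 degree azimuth and a 20 degree incidence angle) are assumed purely for illustration.

```python
import math

def angles_from_slowness(px, py, v):
    """Azimuth and incidence angle (radians) from eq.(5)."""
    ph = math.hypot(px, py)              # p_h = (p_x^2 + p_y^2)^(1/2)
    sin_beta = ph * v                    # sin(beta_w) = p_h * V_w
    if sin_beta > 1.0:
        raise ValueError("p_h * V exceeds 1: no real incidence angle")
    azimuth = math.atan2(px, py)         # sin(a) = p_x/p_h, cos(a) = p_y/p_h
    return azimuth, math.asin(sin_beta), ph

def p_wave_vector(px, py, v):
    """Unit P-wave oscillation vector of eq.(2a) for the simplest medium model."""
    a, beta, _ = angles_from_slowness(px, py, v)
    return (math.sin(a) * math.sin(beta),
            math.cos(a) * math.sin(beta),
            math.cos(beta))

# Assumed example: V = 6 km/sec, azimuth 30 deg, incidence angle 20 deg
v = 6.0
ph_true = math.sin(math.radians(20.0)) / v
px = ph_true * math.sin(math.radians(30.0))
py = ph_true * math.cos(math.radians(30.0))
azimuth, beta, ph = angles_from_slowness(px, py, v)
b_p = p_wave_vector(px, py, v)
print(math.degrees(azimuth), math.degrees(beta), b_p)
```

Using `atan2` keeps the azimuth in the correct quadrant for any signs of $p_x$ and $p_y$.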


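Equations (2) and (5) imply that for the simplest medium model the vector $b$ depends only on
the values of the apparent slownesses $p_x$, $p_y$, $p_h$ and the phase velocities $V_P$, $V_S$
via the following equations:

for P-waves $b_P = (p_x V_P,\ p_y V_P,\ (1 - p_h^2 V_P^2)^{1/2})^T$; (6a)

for SH-waves and Love waves $b_L = (p_y/p_h,\ p_x/p_h,\ 0)^T$; (6b)

for SV-waves $b_{SV} = ((p_x/p_h)(1 - p_h^2 V_S^2)^{1/2},\ (p_y/p_h)(1 - p_h^2 V_S^2)^{1/2},\ p_h V_S)^T$; (6c)

for Rayleigh waves $b_R = i p_h^{-1} \sin\psi\, (p_x,\ p_y,\ -i p_h\, \mathrm{ctg}\,\psi)^T$. (6d)

Note that for Rayleigh and Love waves the values of $p_x$ and $p_y$ depend on the frequency $f$.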

The model above is the simplest in the sense that it does not take into account the
reflections and transformations of the different wave phases on arrival at the day surface.
A more complex but more realistic model of seismic wave propagation in the vicinity of the day
surface was proposed by B. Kennett [9]. From this model one obtains the following equations for
the vector $b$:

for P-waves $b_P = (-\sin\alpha \cdot V_P p_h C_2,\ -\cos\alpha \cdot V_P p_h C_2,\ V_P q_P C_1)^T$; (6a*)

for SH and Love waves $b_L = (2\cos\alpha,\ -2\sin\alpha,\ 0)^T$; (6b*)

for SV-waves $b_{SV} = (\sin\alpha \cdot V_S q_S C_1,\ \cos\alpha \cdot V_S q_S C_1,\ V_S p_h C_2)^T$; (6c*)

for Rayleigh waves $b_R = (-i \sin\alpha \cdot \sin\psi,\ -i \cos\alpha \cdot \sin\psi,\ \cos\psi)^T$; (6d*)


where $V_P$, $V_S$ are the phase velocities of the P and S waves respectively;
$\sin\alpha = p_x/p_h$, $\cos\alpha = p_y/p_h$; $p_h$ is the horizontal apparent slowness of the
wave phase: $p_h = (p_x^2 + p_y^2)^{1/2}$;

$q_P = (V_P^{-2} - p_h^2)^{1/2}$; $q_S = (V_S^{-2} - p_h^2)^{1/2}$;

$C_1 = \frac{2 V_S^{-2} (V_S^{-2} - 2 p_h^2)}{(V_S^{-2} - 2 p_h^2)^2 + 4 p_h^2 q_P q_S}$; $C_2 = \frac{4 V_S^{-2} q_P q_S}{(V_S^{-2} - 2 p_h^2)^2 + 4 p_h^2 q_P q_S}$.
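
The free-surface coefficients and the P-wave vector of eq.(6a*) can be evaluated with the following Python sketch. The function names are hypothetical and the velocities (6.0 and 3.5 km/sec) are assumed for illustration only; the sketch is restricted to $p_h < 1/V_P$, where $q_P$ and $q_S$ are real.

```python
import math

def free_surface_coefficients(ph, vp, vs):
    """q_P, q_S, C_1, C_2 as defined above (real for ph < 1/vp)."""
    qp = math.sqrt(vp ** -2 - ph ** 2)
    qs = math.sqrt(vs ** -2 - ph ** 2)
    a = vs ** -2 - 2.0 * ph ** 2
    denom = a * a + 4.0 * ph ** 2 * qp * qs
    c1 = 2.0 * vs ** -2 * a / denom
    c2 = 4.0 * vs ** -2 * qp * qs / denom
    return qp, qs, c1, c2

def p_wave_surface_vector(azimuth, ph, vp, vs):
    """P-wave vector b_P of eq.(6a*); azimuth in radians."""
    qp, _, c1, c2 = free_surface_coefficients(ph, vp, vs)
    return (-math.sin(azimuth) * vp * ph * c2,
            -math.cos(azimuth) * vp * ph * c2,
            vp * qp * c1)

# Assumed velocities in km/sec; ph = 0 corresponds to vertical incidence
vp, vs = 6.0, 3.5
qp0, qs0, c1_0, c2_0 = free_surface_coefficients(0.0, vp, vs)
b_vertical = p_wave_surface_vector(0.0, 0.0, vp, vs)
b_oblique = p_wave_surface_vector(math.radians(30.0), 0.05, vp, vs)
print(c1_0, c2_0, b_vertical, b_oblique)
```

At vertical incidence the vertical component equals 2, the familiar doubling of the displacement amplitude at a free surface.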


If one neglects the effects of wave reflection from the boundaries of the inner medium layers
and the conversion of wave types at these boundaries while a plane wave propagates through the
layered medium, and restricts oneself to taking into account only the effects of wave refraction
(the ray propagation approximation), then according to Snell's law [1] the value of
$p_h = \sin\beta_w / V_w$ at any point within the batch of layers (and beneath the day surface) is
constant and equal to its value in the half space. As $p_x = \sin\alpha \cdot p_h$ and
$p_y = \cos\alpha \cdot p_h$, the value of the 2-dimensional vector $p = (p_x, p_y)^T$ of
horizontal apparent slowness is also constant at all points of the medium. This vector is a ray
parameter and determines the propagation path of a seismic ray in such a layered medium [1].
Based upon eq.(5) and eq.(6) one can conclude that the seismic wave field at any point $r$ of a
laterally homogeneous medium is determined only by the vector $p$ and the wave velocity $V_w$
in the layer where the point $r$ is located, and is independent of the other parameters of the
layers. Thus, eq.(4), initially written for the homogeneous space, is valid for any point of a
laterally homogeneous (layered) medium and in particular for points just beneath the day
surface, if reflection from the day surface is neglected.

Eq.(4) enables us to interpret the field $w(f,r)$ on the day surface, i.e. for
$r = u = (x, y, 0)^T$, as the result of the propagation of a signal $s_w(f)$ through a linear system:

$w(f,u) = g_w(f,u,p)\, s_w(f)$. (7)

The frequency response of this system (in the case of a laterally homogeneous medium) is

$g_w(f,u,p) = \exp[-i 2\pi f (u^T p)]\, b(p, V_w)$ (8)

If the batch of layers is finely stratified and contrasting enough that it is impossible to
neglect the effects of wave reflection and wave type transformation, the field $w(f,u)$ on the day
surface still admits the representation by eq.(7). However, in this case the frequency response of
the corresponding linear system cannot be expressed by a formula as simple as eq.(8).
Nevertheless, for an arbitrary laterally homogeneous batch it can be determined with the help of
rather effective computational methods like those of Thomson-Haskell or Kennett [10].

Let us consider a system of seismic observations consisting of $m$ 3-component
seismometers registering displacements of medium particles and located on the day surface at
points $u_i$. If the distances between the points $u_i$ are small enough that the medium can be
considered invariant within the aperture of the system, then such a system of seismic
observations can be referred to as a 3-component array. For this case the set of complex spectra
of the seismometer outputs, a $3m$-dimensional column vector
$y_w(f) = (y_{wi}(f),\ i = 1, \dots, 3m)$, may be expressed as

$y_w(f) = h_w(f,p)\, s_w(f)$ (9)

where $h_w(f,p) = (g_w(f,u_i,p),\ i = 1, \dots, m)$ is a $3m$-dimensional column vector of
frequency responses of the linear systems that are determined by the propagation paths of a
plane wave from the lower half space to the seismometer outputs. In particular, for the ray
approximation one can write

$h_w(f,p,V) = (\exp[-i 2\pi f (u_i^T p)]\, b_w(p,V),\ i = 1, \dots, m)$ (10)
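
A minimal NumPy sketch of eq.(10) is given below. The function name `steering_vector`, the sensor coordinates and the slowness values are assumptions made for illustration; the polarisation vector is taken as a fixed input rather than computed from $p$ and $V$.

```python
import numpy as np

def steering_vector(f, positions, p, b):
    """3m-dimensional vector of eq.(10) for m sensors.

    positions: (m, 2) horizontal sensor coordinates u_i
    p:         (2,) horizontal apparent slowness vector
    b:         (3,) polarisation vector b_w(p, V)
    """
    positions = np.asarray(positions, dtype=float)
    p = np.asarray(p, dtype=float)
    b = np.asarray(b, dtype=complex)
    delays = positions @ p                        # u_i^T p for every sensor
    phases = np.exp(-2j * np.pi * f * delays)     # exp[-i 2 pi f (u_i^T p)]
    return (phases[:, None] * b[None, :]).reshape(-1)

# Assumed example: 3 sensors (km), slowness in sec/km, P-like polarisation
positions = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5)]
p = (0.05, 0.08)
b = (0.17, 0.30, 0.94)
h = steering_vector(2.0, positions, p, b)
print(h.shape)
```

The components are ordered sensor by sensor, three components per sensor, so the first three entries belong to the sensor placed at the origin and coincide with the polarisation vector itself.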
The considerations above allow us to conclude that, in the framework of the ray
propagation approximation, the origin of coordinates may without any loss of generality be
placed at $u_1$, the location of the central array sensor. Then $s_w(f)$ can be interpreted as
the waveform of a seismic phase at the central sensor.

Eqs.(9) and (10) demonstrate that in the framework of the ray propagation approach the signals
from an array of 3-component seismometers contain information on the seismic phase apparent
slowness vector $p$, the wave velocity $V_w$ in the layer beneath the surface and the phase
waveform $s_w(f)$. In seismic monitoring, estimation of the apparent slowness vector $p$ is
crucial for effective location of seismic event epicentres using data from a single observational
site. Determination of the seismic phase waveforms $s_w(f)$ is necessary for identification of the
seismic source type and estimation of the source seismic moment tensor. Estimation of the wave
velocity is important for investigation of the regional medium structure on the basis of seismic
data. Usually these problems are solved sequentially: after the value of the vector $p$ is
determined, the phase velocity $V_w$ can be estimated from the polarisation characteristics of
the three-component observations based on eq.(6). Finally, if $p$ and $V_w$ are known, the best
conditions exist for retrieving the seismic phase waveform $s_w(f)$ (or $s_w(t)$) from
3-component multidimensional observations (using eqs.(9), (10)), i.e. for extracting the function
$s_w(f)$ from the noise background. The latter is quite important for seismic source
identification of small events.

Information on the apparent slowness vector $p$ is contained in the relative time delays
$\tau_k = (u_k - u_1)^T p$ of the sensor signals and in the polarisation characteristics of each
3-component seismometer output (mathematically expressed by the vectors $b_w(p,V)$). It is
important that the relative delays do not depend on the velocity $V_w$ of a seismic wave phase in
the layer where the seismometers are placed. In contrast, the polarisation characteristics are
strongly dependent on $V_w$. The upper subsurface layer in many regions has small phase
velocities, often known with high uncertainty. In these cases the angles of incidence $\beta_w$ of
body seismic waves at the day surface are small even for regional events. As a result the
accuracy of measurements based on polarisation characteristics is rather poor [23]. In these
conditions the more accurate methods of measuring the apparent slowness $p$ are those based on
the time delays $\tau_k$, for example, methods of spatial spectral analysis (F-K analysis).

Nevertheless, 3-component array observations provide some information about the wave
azimuth which is additional to the relative time delays $\tau_k$ and also does not depend on the
wave velocity $V_w$ in the surface layer. Indeed, as follows from eq.(3) and eq.(6), the amplitude
ratio between the signals of the seismometer horizontal components is determined only by the
seismic wave azimuth.

Hence, in order to extract all the information about the apparent slowness vector $p$
contained in 3-component array observations, one should combine both approaches, polarisation
analysis and F-K analysis, in a single 3C-data processing algorithm. Some such algorithms are
described in Section 3.

Signals of seismometers are usually observed against a noise background
$\xi(t) = (\xi_{kj}(t),\ k = 1, \dots, m,\ j = 1, 2, 3)$. Therefore the output signals of a
multichannel system of seismic observations compose a multidimensional random time series

$x_w(t) = y_w(t) + \xi(t) = h_w(t,p,V) * s_w(t) + \xi(t)$ (11)

where $h_w(t,p,V)$ is the vector impulse response of the medium for the given wave phase, equal
to the Fourier transform of the vector-function (10), and $*$ is the sign of convolution. In the
frequency domain eq.(11) can be written as

$x_w(f) = h_w(f,p,V)\, s_w(f) + \xi(f)$ (12)

where $\xi(f)$ is a $3m$-dimensional column vector of the noise complex spectrum.

The presence of noise in the observations makes it necessary to take into account the
distorting effects of noise when solving the problems of seismic phase parameter determination
discussed above. That is, these problems should be formulated as problems of statistical time
series analysis, and the theory of optimal statistical inference [3,7] should be applied to their
solution. This meets some difficulties because the theory has been mostly developed for the case
of known statistical characteristics of the noise, which is usually not so for seismic noise.

Numerous investigations have provided enough evidence for the seismic noise $\xi(t)$ to be
considered a Gaussian process, which may be assumed stationary (at time intervals typical for
seismic signal duration) with zero mean and a smooth matrix power spectral density (MPSD)
$F(f) = E\{\xi(f)\xi^T(f)\}$. Here $E$ represents the sign of mathematical expectation. The
MPSD $F(f)$ reflects the space and time correlations of seismic noise, which are as a rule
non-zero and most pronounced in the low-frequency band. This is especially typical for seismic
noise in regions close to sea and ocean shores, where the seismic noise is generated mostly by the
surf. In such regions the space correlation of noise (coherence of noise) is so high that for
frequencies lower than 1-3 Hz the MPSD $F(f)$ of the noise vector time series $\xi(t)$ registered
by a small aperture seismic array is nearly singular [5,18]. The proximity of the determinant of
$F(f)$ to zero forms the criterion for noise coherency with regard to a given observational system.
As follows from general principles of mathematical statistics, if the determinant of $F(f)$ is
close to zero, the probabilities of errors for statistically optimal decision rules are close to zero
as well. Note that for non-optimal procedures this is by no means necessarily so. Hence it seems
rather desirable to take into account information on the noise MPSD $F(f)$ in the synthesis of
statistically optimal algorithms for processing multicomponent seismic observations. This often
provides a significant increase in the accuracy of seismic signal parameter estimation due to the
effect of "compensation" (suppression) of noise by statistically optimal data processing
procedures.

To achieve the theoretical efficiency of noise suppression, exact information
about its MPSD $F(f)$ is required. Over long time intervals, however, a noise time series $\xi(t)$ cannot be
regarded as a multidimensional stationary process: its MPSD $F(f)$, being defined by the current
characteristics of the seismic noise field, changes in time. Therefore the statistically optimal
procedures of multicomponent seismic data analysis can be efficient only in the framework of an
adaptive approach, where the noise MPSD is continuously or periodically estimated using
current noise observations.
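
The idea of re-estimating the noise MPSD from current observations and using its determinant as a coherence indicator can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: it uses NumPy and a simple segment-averaged periodogram estimator (a stand-in, not the ARMA estimator discussed later in this report), and the names `estimate_mpsd` and `normalised_determinant` are hypothetical.

```python
import numpy as np

def estimate_mpsd(noise, nseg):
    """Segment-averaged estimate of the matrix power spectral density.

    noise: (N, n) real array, N samples of an n-channel noise record.
    Returns F of shape (nfreq, n, n): one Hermitian matrix per frequency bin.
    nseg should exceed n, otherwise the estimate is singular by construction.
    """
    N, n = noise.shape
    L = N // nseg
    win = np.hanning(L)
    F = 0.0
    for s in range(nseg):
        seg = noise[s * L:(s + 1) * L] * win[:, None]
        X = np.fft.rfft(seg, axis=0)                   # (nfreq, n) spectra
        F = F + X[:, :, None] * X[:, None, :].conj()   # outer products x x^*
    return F / (nseg * np.sum(win ** 2))

def normalised_determinant(F):
    """det F(f) divided by the product of its diagonal elements.

    Close to 0 for strongly coherent noise, close to 1 for uncorrelated channels.
    """
    diag = np.real(np.einsum('fkk->fk', F))
    return np.real(np.linalg.det(F)) / np.prod(diag, axis=1)
```

In an adaptive scheme of the kind described above, `estimate_mpsd` would be re-run on each new noise window preceding a signal, and the resulting matrices passed to the filter design.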

3.2. Coherent noise suppression and seismic waveform extraction using data from 3-component arrays

3.2.1. Introduction 

The adaptive group filtering technique developed and tested in the previous research
stages can be extended to the processing of data from 3-component small aperture arrays and micro arrays.
The latter are of primary interest as Alpha-stations of the International Monitoring
Network being developed for verification of Test Ban Treaties. The use of 3-component array
data makes it possible to enhance significantly the quality of extraction of seismic phase waveforms from
background noise, especially if the adaptive optimal group filtering (AOGF) method is employed.
It becomes possible to extract waveforms of different event phases which are
characterised by different polarisation features, including some that cannot
be handled using 1-component array data alone: regional SH, teleseismic Love and so on.

Seismic noise (especially in sea-shore regions) is often transient and
consists of surface waves generated by surf or industrial sources. In these cases it
exhibits explicit coherence and polarisation features. As has been shown, the coherence of noise can be
successfully exploited by the AOGF method in the case of 1-component array data processing.
This method can be extended to 3-component (3C) array data processing. In this case the (3m x 3m)
matrix power spectral density (MPSD) of the 3C noise records has to be estimated, and
the polarisation characteristics of the noise are automatically captured in this MPSD. To adjust
the AOGF correctly for extraction of the waveform of a given seismic phase, one has to account for
the polarisation features of this phase at the site of the 3C observations. Different phases should therefore be
treated by different AOG filters. Parallel processing of the same 3-component array
recording by these filters produces three output traces reflecting wavetrain oscillations in the
longitudinal, transverse and vertical directions generated by the P, SH and SV body waves and by the Lg,
Rayleigh and Love surface waves. Such analysis can be very helpful for investigating complex
wave-fields in regions with strongly laterally heterogeneous media. It can be
used to improve the quality of event source location and identification in regional
monitoring with the help of a 3-component array.

Some complication of the computations is justified by the advantages of
the described combined procedure, which accounts for differences in both the relative delays and the
polarisation characteristics of the array signal and noise. By our assessment, applying this procedure
to data from a 3C micro array can provide the same quality of signal extraction
as data from a 1C small aperture array with 2-3 times as many sensors.

This technique was tested using seismograms of the Geyocha 3C
array, which was temporarily deployed within the framework of the PASSCAL Project in Turkmenistan near
the town of Ashgabad and operated during 1993-1994. The major interest of this PASSCAL
experiment for the purposes of our study is that the Geyocha array contained a 3C small
aperture subarray composed of 12 3C very broad band STS-2 seismometers, which
registered the wave-fields from a set of local, regional and teleseismic events in the extremely wide
frequency range 0.002-10 Hz. This opens a possibility to extract the waveforms of body-wave and
surface-wave event phases from a single multichannel array seismogram and thus to eliminate
the interference usually introduced by the frequency responses of different seismometers registering event
signals in distinct bands of the seismic range. This array is also interesting as an example of a
seismic installation deployed in a thick sedimentary basin with strongly laterally heterogeneous
media characteristics.


3.2.2. Statistically optimal group filter (Wiener filter) 

According to eq.(1.11), to restore a waveform $s(t)$ from seismic data $x(t)$ recorded by a
3-component (3C) seismic array, one should compensate for the distorting effect of the medium and
clean the signal of noise. As follows from the statistical theory of time series analysis, if the
distribution of the noise $\xi(t)$ is Gaussian, the best procedure for restoring the function $s(t)$ is
group filtering, in which the scalar complex estimate $\hat{s}(f)$ of the waveform spectrum $s(f)$ is found as
follows:

$$\hat{s}(f) = \phi^{*}(f)\,x(f) \qquad (1)$$

where $\phi(f) = (\phi_{jk}(f),\ j = 1,\dots,m,\ k = 1,2,3)$ is the 3m-component vector frequency response of the
group filter and the symbol $*$ denotes Hermitian conjugation.

The problem of statistically optimal estimation of the waveform $s(f)$ (optimal group filtering)
consists in seeking a frequency response $\phi(f)$ which minimises the mean square deviation

$$E_s\{|\hat{s}(f) - s(f)|^{2}\} = \min \qquad (2)$$

under the condition of unbiased signal estimation:

$$E_s\{\hat{s}(f)\} = s(f). \qquad (3)$$

In the above equations $E_s$ denotes conditional mathematical expectation given the signal;
eq.(2) and eq.(3) have to be valid at each frequency $f$ for any signal
realisation. The requirements of eq.(2) and eq.(3) are equivalent to the requirement that the variance
of the signal estimate be minimal:

$$D_s\{\hat{s}(f)\} = \min. \qquad (4)$$

Minimising the variance of the signal estimate, eq.(4), under the condition of eq.(3) results in
the following expression for the frequency response of the OGF [4,5,16,27]:

$$\phi_o^{*}(f) = \frac{h_w^{*}(f,p,V)\,F^{-1}(f)}{h_w^{*}(f,p,V)\,F^{-1}(f)\,h_w(f,p,V)} \qquad (6)$$

where $F^{-1}(f)$ is the inverse of the [3m x 3m] matrix power spectral density
$F(f) = E\{\xi(f)\xi^{*}(f)\}$ of the 3C array noise $\xi(t)$, and the 3m-vector function $h_w(f,p,V)$ is the
frequency response of the medium along the propagation paths of the given seismic phase $w$ from the
first 3C array sensor to the other ones. In the ray approximation of seismic wave propagation,
$h_w(f,p,V)$ is given by eq.(1.10).

In the absence of noise the OGF output signal coincides with the phase waveform $s_w(f)$.
Indeed, in accordance with eq.(1) and eq.(1.9), if noise is absent ($\xi(t) = 0$), then

$$\hat{s}(f) = \frac{h_w^{*}(f,p,V)\,F^{-1}(f)}{h_w^{*}(f,p,V)\,F^{-1}(f)\,h_w(f,p,V)}\;h_w(f,p,V)\,s_w(f) = s_w(f). \qquad (7)$$
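
A minimal numerical sketch of eq.(6) and eq.(7) follows. It is not taken from this report: it assumes Python with NumPy, a 1-component array whose medium response reduces to plane-wave delays (the polarisation factor is omitted for brevity), and a noise MPSD made of one coherent wave plus white diffusion noise. The names `steering`, `ogf_response`, `beamformer_response` and `noise_power` are hypothetical.

```python
import numpy as np

def steering(f, coords, p):
    """Assumed plane-wave response h_k = exp(-i 2 pi f u_k^T p); coords in km, p in s/km."""
    return np.exp(-2j * np.pi * f * (coords @ p))

def ogf_response(F, h):
    """Column vector phi such that phi^* = h^* F^{-1} / (h^* F^{-1} h), as in eq.(6)."""
    Fih = np.linalg.solve(F, h)              # F^{-1} h
    return Fih / np.real(h.conj() @ Fih)     # h^* F^{-1} h is real for Hermitian F

def beamformer_response(h):
    """Conventional beamforming: eq.(6) with F replaced by the identity matrix."""
    return h / np.real(h.conj() @ h)

def noise_power(phi, F):
    """Output noise power phi^* F phi."""
    return np.real(phi.conj() @ F @ phi)

coords = 0.5 * np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)
h = steering(1.0, coords, np.array([0.1, 0.0]))      # signal phase
q = steering(1.0, coords, np.array([-0.25, 0.2]))    # coherent noise wave
F = 100.0 * np.outer(q, q.conj()) + np.eye(5)        # coherent + white diffusion noise

phi_o = ogf_response(F, h)
phi_b = beamformer_response(h)
print("signal gain OGF:", phi_o.conj() @ h)          # equals 1: waveform is not distorted
print("noise power OGF / BF:", noise_power(phi_o, F), noise_power(phi_b, F))
```

Both filters pass the signal with unit gain, so the comparison of output noise powers directly shows the gain of the optimal filter over beamforming for this assumed noise model.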

If the noise consists only of diffusion seismic noise, i.e. the noise records $\xi_j(t)$ at all 3C array
sensor components are uncorrelated, then $F^{-1}(f) = \sigma^{-2}D^{-1}(f)$ is a diagonal matrix. In this
case it is often (and groundlessly) assumed that the diffusion noise is white: $D(f) = I$. The
optimal group filtering procedure then coincides with the classical 1C array beamforming (BF)
procedure applied to 1C seismograms recorded after rotating the 1C sensor to the direction of particle
motion of the seismic phase being processed. To perform the latter procedure one must first
determine the type of the phase.

The optimal group filter eq.(6) does not distort the signal waveform and provides the
maximum signal-to-noise ratio only if the vector function $h_w(f,p,V)$ used corresponds well to the
true frequency response of the medium along the signal propagation paths. For a plane wave and
the ray propagation approach it must have the form given by eq.(1.10), where the 3C vector
function $b_w(p,V)$ corresponds to the phase type (eq.(1.6)), the vector $p$ coincides with the true
apparent slowness vector of the signal wave and $V$ with the true wave velocity of the seismic
phase in the subsurface layer of the medium. These values should be estimated beforehand (with the
help of the procedures described in Section 3 below). Another possible approach consists in
scanning the expected signal arrival directions (and/or wave velocities) with the help of a "fan"
of filters eq.(6) with different functions $h_w(f,p,V)$. The values of $p$ and $V$ at which the
maximum output power over the fan of filters is attained can serve as estimates of the seismic wave
parameters. The OGF output signal for this direction is an estimate of the signal waveform.
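
The "fan" of filters can be sketched as a grid search over trial slowness vectors. The Python code below is an illustration only, assuming NumPy, a single frequency, a 1-component array with a pure plane-wave response, and the hypothetical function name `scan_fan`. For strongly coherent noise the response of eq.(6) may exceed 1 away from the steering point, so the power maximum should be treated as a heuristic estimate.

```python
import numpy as np

def scan_fan(x, F, f, coords, grid):
    """Steer the filter of eq.(6) over a grid of slowness vectors.

    x: (nsnap, m) complex spectra of the array records at frequency f,
    F: (m, m) noise MPSD at f, coords: (m, 2) sensor coordinates in km,
    grid: 1-D array of trial slowness components in s/km.
    Returns the trial slowness vector with the largest mean output power.
    """
    Fi = np.linalg.inv(F)
    best_p, best_power = None, -np.inf
    for px in grid:
        for py in grid:
            p = np.array([px, py])
            h = np.exp(-2j * np.pi * f * (coords @ p))     # assumed plane-wave response
            phi = Fi @ h / np.real(h.conj() @ Fi @ h)      # eq.(6) as a column vector
            power = np.mean(np.abs(x @ phi.conj()) ** 2)   # |phi^* x|^2 averaged over snapshots
            if power > best_power:
                best_p, best_power = p, power
    return best_p

coords = 0.5 * np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)
p_true = np.array([0.1, -0.2])
rng = np.random.default_rng(0)
amp = rng.standard_normal(4) + 1j * rng.standard_normal(4)
# Noise-free snapshots of a plane wave with slowness p_true at f = 1 Hz
x = amp[:, None] * np.exp(-2j * np.pi * 1.0 * (coords @ p_true))[None, :]
p_est = scan_fan(x, np.eye(5), 1.0, coords, np.linspace(-0.4, 0.4, 17))
print(p_est)
```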

It is easy to show that a group filter with the frequency response

$$\phi_w^{*}(f) = \frac{h_w^{*}(f,p,V)\,F^{-1}(f)}{[h_w^{*}(f,p,V)\,F^{-1}(f)\,h_w(f,p,V)]^{1/2}} \qquad (8)$$

also suppresses the coherent component of noise with the MPSD $F(f)$, but provides a whitened
residual noise output $\eta(f) = \phi_w^{*}(f)\xi(f)$, for which $E\{\eta\eta^{*}\} = I$, where $I$ is the identity matrix
(that is, the output noise has unit power spectral density at every frequency).
However, this group filter distorts the signal waveform. We call it the whitening group filter
(WGF). As shown in Section 3, the WGF is a general intermediate procedure in the
statistical estimation of seismic wave apparent velocity.
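
The two properties of the whitening group filter stated above (unit output noise power, distorted signal gain) can be checked numerically. This is a minimal sketch assuming NumPy, a randomly generated Hermitian positive definite matrix in place of a real noise MPSD, and the hypothetical function name `wgf_response`.

```python
import numpy as np

def wgf_response(F, h):
    """Column vector phi with phi^* = h^* F^{-1} / sqrt(h^* F^{-1} h), as in eq.(8)."""
    Fih = np.linalg.solve(F, h)
    return Fih / np.sqrt(np.real(h.conj() @ Fih))

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = A @ A.conj().T + np.eye(n)              # synthetic Hermitian positive definite MPSD
h = np.exp(2j * np.pi * rng.random(n))      # synthetic unit-modulus medium response

phi_w = wgf_response(F, h)
out_noise = np.real(phi_w.conj() @ F @ phi_w)            # output noise power
gain = phi_w.conj() @ h                                  # signal gain, generally not 1
expected_gain = np.sqrt(np.real(h.conj() @ np.linalg.solve(F, h)))
print(out_noise, gain, expected_gain)
```

The signal gain equals the square root of $h^{*}F^{-1}h$, which depends on frequency; this is the waveform distortion mentioned in the text.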

3.2.3. Adaptation of optimal group filter to variations of noise spectrum

If the [3m x 3m] matrix function F(f) used to calculate the OGF frequency response
via eq.(6) approximates the real MPSD of the current array noise well, then the OGF provides
an output with a significantly greater SNR than conventional beamforming does. Theoretically the
OGF SNR gain tends to infinity as the determinant of F(f) tends to zero. This occurs in the case
of purely coherent noise. The determinant of F(f) can thus serve as a measure of noise
coherence and indicates situations where the use of the OGF would be successful.

As seismic noise is usually not stationary and its MPSD varies in time, the OGF method
can be effective only in an adaptive processing system, where the MPSD F(f) is periodically
estimated (updated) using current noise observations. We call this array data processing
procedure adaptive optimal group filtering (AOGF).

We have found that the AOGF procedure with the group filter frequency response of eq.(6) in
most cases performs better than the conventional beamforming (BF) procedure if the
noise MPSD is estimated by multidimensional autoregressive moving average (ARMA)
modelling of the noise records [16]. This means that the inverse MPSD is evaluated in the form of a
matrix rational function [7]:

$$F^{-1}(f) = \Big(\sum_{k=0}^{p} A_k e^{-i2\pi f k\Delta}\Big)^{*}\Big(\sum_{l=-q}^{q} L_l e^{-i2\pi f l\Delta}\Big)^{-1}\Big(\sum_{k=0}^{p} A_k e^{-i2\pi f k\Delta}\Big) \qquad (9)$$

where $\Delta$ is the data sampling interval. The [3m x 3m] matrices $A_k$, $k = 0,\dots,p$, are determined from the first
$p+1$ sample matrix autocorrelations of the noise observations with the help of a computationally
efficient multichannel version of the Levinson-Durbin procedure [16,19,28]. The MA coefficients,
the [3m x 3m] matrices $L_l$, are calculated as weighted autocovariance matrices with lags $l = -q,\dots,q$ of the
time series produced from the noise data by multichannel whitening filtering with the
matrix AR coefficients $A_k$.
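
The parametric estimation of the inverse MPSD can be illustrated for the pure autoregressive special case (no MA part). The Python sketch below is an assumption-laden stand-in, not the procedure of this report: the AR matrices are fitted by ordinary least squares rather than by the multichannel Levinson-Durbin recursion, the residual covariance plays the role of the single MA term, and the names `fit_var` and `inverse_mpsd` are hypothetical.

```python
import numpy as np

def fit_var(x, p):
    """Least-squares fit of sum_{k=0..p} A_k x(t-k) = e(t) with A_0 = I.

    x: (N, n) real noise record. Returns A of shape (p+1, n, n) and the
    (n, n) covariance matrix of the whitened residual e(t).
    """
    N, n = x.shape
    Y = x[p:]                                                    # targets x(t)
    Z = np.hstack([x[p - k:N - k] for k in range(1, p + 1)])     # lagged regressors
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)                    # Y ~ Z @ B
    resid = Y - Z @ B
    sigma = resid.T @ resid / (N - p)
    A = np.empty((p + 1, n, n))
    A[0] = np.eye(n)
    for k in range(1, p + 1):
        A[k] = -B[(k - 1) * n:k * n].T
    return A, sigma

def inverse_mpsd(A, sigma, f, dt):
    """Inverse MPSD of the fitted AR model at frequency f (Hz), sampling interval dt (s)."""
    k = np.arange(A.shape[0])
    Af = np.tensordot(np.exp(-2j * np.pi * f * k * dt), A, axes=(0, 0))  # A(f)
    return Af.conj().T @ np.linalg.inv(sigma) @ Af / dt
```

Because the inverse MPSD is obtained directly from the AR matrices, no explicit inversion of an estimated, possibly ill-conditioned, spectral matrix is required.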

Whenever the noise MPSD F(f) has to be determined from observations, one of the
following two situations can be faced:

a) the 3C array records comprise an interval where only "pure" noise is present;

b) all observations are a mixture of signal and noise.

In the first case a rather precise estimate of the noise MPSD $F(f)$ can be obtained from the
observations of "pure" noise. In the second case it seems at first glance impossible to get a
good estimate of $F(f)$, since the MPSD of the observations is equal to

$$F_{xw}(f) = F(f) + h_w(f,p,V)\,h_w^{*}(f,p,V)\,\mu(f) \qquad (10)$$

where $\mu(f)$ is the power spectral density of the seismic phase waveform. It is the function of eq.(10)
that will actually be estimated from a mixture of signal and noise. The estimate of $F_{xw}(f)$ obtained
in this case can be quite different from the noise MPSD $F(f)$, especially if the signal-to-noise ratio
is large enough.

The following property of the OGF of eq.(6) (which can be called "adaptation
stability") is a theoretical advantage of the OGF: it is easy to show [12,15] that
substituting $F_{xw}(f)$ (eq.(10)) for $F(f)$ in eq.(6) does not change the value of the OGF
frequency response $\phi_o(f)$. This theoretical advantage is, however, practically meaningful only if the
estimate of $F_{xw}(f)$ is accurate enough.
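
The "adaptation stability" property can be verified numerically. The sketch below assumes NumPy, a synthetic Hermitian positive definite matrix in place of the noise MPSD, a synthetic unit-modulus response vector, and the hypothetical function name `ogf_response`.

```python
import numpy as np

def ogf_response(F, h):
    """Column vector phi such that phi^* = h^* F^{-1} / (h^* F^{-1} h), as in eq.(6)."""
    Fih = np.linalg.solve(F, h)
    return Fih / np.real(h.conj() @ Fih)

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = A @ A.conj().T + np.eye(n)              # "pure" noise MPSD (synthetic)
h = np.exp(2j * np.pi * rng.random(n))      # medium response for the signal phase
mu = 25.0                                   # signal power spectral density at this frequency
Fx = F + mu * np.outer(h, h.conj())         # MPSD of the signal + noise mixture

phi_noise = ogf_response(F, h)
phi_mixture = ogf_response(Fx, h)
print(np.max(np.abs(phi_noise - phi_mixture)))   # difference at rounding-error level
```

The equality follows from the matrix inversion lemma: adding a rank-one term proportional to $h h^{*}$ only rescales $F^{-1}h$, and the scale cancels in the ratio of eq.(6). The check uses the exact mixture MPSD, which is why the property is sensitive to estimation errors in practice.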

3.2.4. Spatial rejection group filter

The best noise suppression by group filtering procedures can be achieved if the noise is
close to coherent. Transient noise generated by one or several sources
localised in the medium has this property. Real seismic noise normally consists of such noise superimposed on
diffusion microseismic noise caused by a huge number of independent sources. That is, the real noise
process at the 3C array sensor outputs can be written as follows:

$$\xi(t) = \sum_{k=1}^{s} q_k(t) * \zeta_k(t) + \delta(t) \qquad (11)$$

where $\zeta_k(t)$ is the scalar noise process in the $k$-th source of coherent noise; $q_k(t)$ is the 3m-dimensional
vector of impulse transfer functions for the noise along the propagation paths through the medium
from the $k$-th noise source to the 3C array sensors; $s$ is the number of coherent noise sources; $\delta(t)$ is the
vector of diffusion noise processes; and $*$ denotes convolution.

If the processes $\zeta_k(t)$, $k = 1,\dots,s$, and $\delta(t)$ are mutually uncorrelated, the matrix power spectral
density of the process $\xi(t)$ has the following form [16,17]:

$$F(f) = \sum_{k=1}^{s} q_k(f)\,\lambda_k(f)\,q_k^{*}(f) + \sigma^{2}D(f) = Q_s(f)\,\Lambda_s(f)\,Q_s^{*}(f) + \sigma^{2}D(f), \qquad (12)$$

where

$$R_s(f) = Q_s(f)\,\Lambda_s(f); \quad Q_s(f) = [q_1(f),\dots,q_s(f)]; \quad \Lambda_s(f) = \mathrm{diag}(\lambda_1(f),\dots,\lambda_s(f)),$$

$\lambda_k(f)$ is the power spectral density of the process $\zeta_k(t)$; $D(f)$ is the matrix power spectral
density of the diffusion noise $\delta(t)$; $\sigma^{2}$ is the diffusion noise power (for simplicity we consider it
to be equal for each array sensor); and $q_k(f)$ is the 3m-dimensional vector frequency response of the
medium along the propagation paths from the $k$-th source of coherent noise to the 3C array sensors.

If the diffusion noise is absent, i.e. $\sigma^{2} = 0$, and the number of coherent noise sources is smaller
than the number of array sensor components (in our case, $s < 3m$), the matrix $F(f)$ becomes singular. It is shown
in [12,17] that if the diffusion noise power tends to zero ($\sigma^{2} \to 0$), then

$$\phi_r^{*}(f) = \lim_{\sigma^{2}\to 0}\phi_o^{*}(f) = \frac{h_w^{*}(f,p,V)\,B_s(f)}{h_w^{*}(f,p,V)\,B_s(f)\,h_w(f,p,V)}, \qquad (13)$$

where $B_s = I - R_s(R_s^{*}R_s)^{-1}R_s^{*}$ and $I$ is the [3m x 3m] identity matrix.

In the most important particular case, where only one source of coherent noise exists ($s = 1$),

$$B_1(f) = I - q(f)\,q^{*}(f)/|q(f)|^{2}. \qquad (14)$$

In the absence of diffusion noise, the noise at the output of the group filter with the frequency response
of eq.(13) is equal to zero:

$$\phi_r^{*}(f)\,\xi(f) = \phi_r^{*}(f)\,Q_s(f)\,\zeta(f) = 0 \qquad (15)$$

where $\zeta(f) = (\zeta_k(f),\ k = 1,\dots,s)$ is the vector of processes in the noise sources. Eq.(15) follows from the
equation $B_s(f)Q_s(f) = 0$, which is very easy to prove for $s = 1$:

$$B_1(f)\,q(f) = q(f) - q(f)\,q^{*}(f)\,q(f)/|q(f)|^{2} = 0. \qquad (16)$$

Thus, in the absence of diffusion noise, i.e. when the seismic noise is fully coherent, it will be
completely suppressed by the group filter with the frequency response of eq.(13). We call this group
filter the "spatial rejection group filter" (SRGF).
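
The rejection property and the limiting relation between the optimal filter and the SRGF can be checked numerically. The following Python sketch is illustrative only: it assumes NumPy, randomly generated response vectors in place of real medium responses, unit source spectra, white diffusion noise, and the hypothetical function names `srgf_response` and `ogf_response`.

```python
import numpy as np

def srgf_response(Q, h):
    """Column vector phi with phi^* = h^* B / (h^* B h), B = I - Q (Q^* Q)^{-1} Q^*.

    Q: (n, s) matrix whose columns are the coherent-noise response vectors.
    """
    n = Q.shape[0]
    B = np.eye(n) - Q @ np.linalg.solve(Q.conj().T @ Q, Q.conj().T)
    Bh = B @ h
    return Bh / np.real(h.conj() @ Bh)

def ogf_response(F, h):
    """Optimal group filter of eq.(6) as a column vector."""
    Fih = np.linalg.solve(F, h)
    return Fih / np.real(h.conj() @ Fih)

rng = np.random.default_rng(3)
n, s = 6, 2
Q = rng.standard_normal((n, s)) + 1j * rng.standard_normal((n, s))
h = np.exp(2j * np.pi * rng.random(n))

phi_r = srgf_response(Q, h)
leak = np.abs(phi_r.conj() @ Q)          # coherent noise transfer, zero up to rounding
gain = phi_r.conj() @ h                  # signal gain, equal to 1

# Optimal filter for coherent noise plus very weak white diffusion noise
F = Q @ Q.conj().T + 1e-6 * np.eye(n)
phi_o = ogf_response(F, h)
print(leak, gain, np.max(np.abs(phi_o - phi_r)))
```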

Note that coherent noise suppression with the help of the SRGF of eq.(13) requires
information on the frequency response vectors $q_k(f)$ of the noise propagation paths in the medium.
This causes serious problems for its practical implementation.

In some cases the coherent noise can be considered as caused by a single surface wave ($s = 1$).
In this case the vector $q_1(f)$ can be found from eq.(1.10), where $p$ is the apparent slowness
vector of the coherent noise wave and $b_w(p,V)$ is determined by the type of the noise wave with the help
of eq.(1.6b) or eq.(1.6d). The apparent slowness vector $p$ of such an interfering noise wave can be
estimated with the help of F-K analysis of array records made in a time interval where only
coherent noise is present. Another variant of the adaptive SRGF can be obtained if one uses
the principal eigenvectors of the MPSD of the coherent noise records as estimates of the vectors $q_k(f)$.
The MPSD is preferably evaluated by multidimensional ARMA
modelling of the noise time series.
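
The eigenvector variant can be sketched as follows. This is an illustration under simplifying assumptions (NumPy, an exactly known MPSD built from two synthetic coherent sources plus white diffusion noise, hypothetical function name `coherent_subspace`); with an estimated MPSD the recovered subspace is only approximate.

```python
import numpy as np

def coherent_subspace(F, s):
    """Return the s principal eigenvectors of the Hermitian MPSD F as columns."""
    w, V = np.linalg.eigh(F)     # eigenvalues in ascending order
    return V[:, -s:]

rng = np.random.default_rng(4)
n, s = 6, 2
Q = rng.standard_normal((n, s)) + 1j * rng.standard_normal((n, s))
F = Q @ np.diag([50.0, 20.0]) @ Q.conj().T + np.eye(n)   # coherent part + white diffusion noise

V = coherent_subspace(F, s)
P_eig = V @ V.conj().T                                          # projector from eigenvectors
P_true = Q @ np.linalg.solve(Q.conj().T @ Q, Q.conj().T)        # projector onto span of q_k
print(np.max(np.abs(P_eig - P_true)))
```

The individual eigenvectors generally differ from the individual vectors $q_k(f)$, but they span the same subspace, which is all that the rejection matrix $B_s = I - P$ requires.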

Nevertheless, the design of a spatial rejection filter using eq.(13), (14) always suffers from
some uncertainty with respect to the number of coherent noise waves, their types and their parameters
$p$, $V_w$. Experimental investigations of such filters for NORESS type small aperture
seismic arrays showed that the zones of noise suppression over the apparent slowness plane
$(p_x, p_y)$ are rather narrow for any SRGF (although they are deep enough). Therefore, a SRGF
provides strong suppression of coherent noise only if the medium frequency response vectors $q_k(f)$
used to characterise the noise waves correspond precisely to reality: one should know
accurately the types of the noise wave phases and the apparent velocities of these waves, and be sure that the
waves are really close to plane. Naturally, all these conditions are seldom met in practice.

At the same time, if one uses the adaptive optimal group
filter in its canonical version eq.(6) for coherent noise suppression, this requires estimation of the matrix power spectral density
(MPSD) F(f) of the array sensor noise, irrespective of the physical origin of the noise. It is
obvious that in practice there is no ideally coherent noise and F(f) is always non-singular
(although it may be poorly conditioned). Therefore, in practice the suppression of coherent
noise with the help of the AOGF is a computationally correct procedure (provided the
calculations are accurate enough, which is quite feasible for present-day computers).

3.2.5. Optimal group filter with additional constraints 

As discussed in Section 3.2.2, when the MPSD of the array noise is known, the
statistically optimal (Wiener) group filter (OGF) that does not distort the frequency
content of a phase waveform has to have the vector frequency response given by eq.(6). The filter minimises the
output residual noise if the matrix function $F^{-1}(f)$ used in eq.(6) corresponds to the inverse
MPSD of the real noise at the array sensors. To ensure this correspondence the OGF
should be adapted continuously or periodically, i.e. the adaptive optimal group filter (AOGF)
should be implemented in seismological practice.

Our studies of AOGF spatial sensitivity diagrams (SSD) for different array configurations
and different noise conditions revealed that maps of these diagrams (as functions of the apparent
slowness vectors (ASV) of seismic wave arrival directions) have deep minima that suppress coherent
noise waves and always have the value 1 at the point $ASV_0$ corresponding to the expected
signal arrival direction (to which the AOGF has been steered). These features of the AOGF SSD are
implied by the assumptions eq.(2)-(4) made in the AOGF design procedure. However, the
AOGF SSD as a rule has a rather high maximum exceeding 1 and located away from the $ASV_0$
point. This can degrade the noise suppression capability of the AOGF if the coherent noise
features (e.g. its arrival direction) change abruptly during AOGF operation before the
next cycle of adaptation.

Several methods have been proposed to overcome this potential disadvantage of the AOGF. One of
them was discussed in [5] and consists in applying additional constraints in the
synthesis of the AOGF frequency response. The constraint (3), which requires the signal to pass undistorted, can be
formulated as

$$\phi^{*}(f)\,h_w(f,p,V) = 1. \qquad (17)$$

It is possible to apply a linear constraint of the general form

$$\phi^{*}(f)\,H_w(f,p,V) = a^{T}, \qquad (18)$$

where $H_w(f,p,V)$ is some matrix depending on $f$, $p$, $V$ and the phase type $w$, and $a$ is some
constant vector. Under constraint (18) the optimal group filter is designed by deriving
analytically the filter vector frequency response which provides the minimum of the filter
output power:

$$\phi_{oc}(f) = \arg\min_{\phi(f)}\,[\phi^{*}(f)\,F_x(f)\,\phi(f)], \qquad (19)$$

where $F_x(f)$ is the matrix spectral density of the signal + noise array observations.

The vector group filter frequency response satisfying the conditions of eq.(18), (19) can be
found by the method of Lagrange multipliers, which implies minimisation of the Lagrange
functional

$$\Lambda = \phi^{*}(f)F_x(f)\phi(f) - \lambda^{*}[H_w^{*}(f,p,V)\phi(f) - a] - [\phi^{*}(f)H_w(f,p,V) - a^{T}]\lambda, \qquad (20)$$

with subsequent determination of the vector of Lagrange multipliers $\lambda$ from the constraints of eq.(18).
This method provides the following equation for the vector frequency response of the constrained
optimal group filter:

$$\phi_{oc}(f) = F_x^{-1}(f)\,H_w(f,p,V)\,[H_w^{*}(f,p,V)\,F_x^{-1}(f)\,H_w(f,p,V)]^{-1}a. \qquad (21)$$

To design a robust AOGF which would be less sensitive to changes of
the supposed (or previously estimated) noise MPSD, the following constraints are relevant [5]:

$$H_w(f,p,V) = \Big[h_w(f,p_0,V),\ \frac{\partial}{\partial p_x}h_w(f,p_0,V),\ \frac{\partial}{\partial p_y}h_w(f,p_0,V)\Big]; \quad a = (1,0,0)^{T}. \qquad (22)$$

The constraints defined by eq.(22) guarantee that the AOGF with the frequency response (21)
will not have high "side lobes" in the vicinity of the arrival direction with ASV $p_0$ (to which
the AOGF is steered). This feature of the AOGF can be valuable in the case of strongly
nonstationary noise conditions.
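
A constrained filter with derivative constraints of this kind can be sketched in a few lines. The code below is an illustration, not the implementation used in this report: it assumes NumPy, a 1-component array with a pure plane-wave response (so that the derivatives with respect to the slowness components are available in closed form), a synthetic noise MPSD, and the hypothetical function names `constraint_matrix` and `constrained_ogf`.

```python
import numpy as np

def constraint_matrix(f, coords, p0):
    """Columns: h(p0), dh/dp_x and dh/dp_y for h_k = exp(-i 2 pi f u_k^T p)."""
    h = np.exp(-2j * np.pi * f * (coords @ p0))
    dh_dpx = -2j * np.pi * f * coords[:, 0] * h
    dh_dpy = -2j * np.pi * f * coords[:, 1] * h
    return np.column_stack([h, dh_dpx, dh_dpy])

def constrained_ogf(Fx, H, a):
    """Minimise phi^* Fx phi subject to H^* phi = a: phi = Fx^{-1} H (H^* Fx^{-1} H)^{-1} a."""
    FiH = np.linalg.solve(Fx, H)
    return FiH @ np.linalg.solve(H.conj().T @ FiH, a)

coords = 0.5 * np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)
f, p0 = 1.0, np.array([0.1, 0.0])
H = constraint_matrix(f, coords, p0)
a = np.array([1.0, 0.0, 0.0])

q = np.exp(-2j * np.pi * f * (coords @ np.array([-0.25, 0.2])))   # coherent noise wave
F = 100.0 * np.outer(q, q.conj()) + np.eye(5)                     # "pure" noise MPSD
h = H[:, 0]
Fx = F + 25.0 * np.outer(h, h.conj())                             # signal + noise MPSD

phi_noise = constrained_ogf(F, H, a)
phi_mixture = constrained_ogf(Fx, H, a)
print(H.conj().T @ phi_mixture)                    # approximately (1, 0, 0)
print(np.max(np.abs(phi_noise - phi_mixture)))     # the two designs coincide
```

With $a = (1,0,0)^{T}$ the filter has unit gain at $p_0$ and zero gradient of the response with respect to $p_x$ and $p_y$ there, which flattens the sensitivity diagram around the steering point.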

Note that the constrained AOGF of eq.(21),(22) has a very valuable feature similar to
that of the conventional Wiener AOGF: the vector function $\phi_{oc}(f)$ does not change if
the inverse MPSD $F_x^{-1}(f)$ of the signal + noise observations in eq.(21) is replaced by the inverse
MPSD $F^{-1}(f)$ corresponding to the "pure" noise. This makes it possible to adapt the AOGF
using noise recordings in time intervals preceding or following the signal phase intervals.
Practice has shown that in real AOGF implementations this leads to a higher noise suppression
capability.

3.3. Adaptive statistical algorithms for estimation of azimuth and apparent velocity of seismic phases using data from 3-component arrays

3.3.1. Introduction

Two conventional techniques are currently used to estimate the arrival direction (AD) parameters
of seismic phases, azimuth and apparent velocity: polarisation analysis of
3-component (3C) records from a single seismic station and F-K analysis of data from
1-component seismic arrays. Various modifications of these techniques have been developed.
Some of them, such as high resolution F-K analysis based on ARMA modelling of
1-component array signals, were investigated in [12,16].

The wide use of 3C small aperture and micro arrays compels seismologists to develop
new methods of azimuth (AZ) and apparent velocity (APV) estimation that extract all the
information about these parameters contained in 3C array records, in both the relative time
delays of the array signals and their polarisation characteristics. The theoretical background for
statistically optimal estimation of AD parameters using 3C array observations obscured by
coherent noise was investigated in [11,13]. Three algorithms are proposed which are optimal
under different assumptions about the phase waveform. A peculiarity of these algorithms is that
the parameters AZ and APV have to be estimated simultaneously with the wave velocity (WV)
of the given seismic phase in the medium beneath the array (if this velocity is unknown).

Theoretical assessments and computer simulations reveal that a significant
improvement in estimation accuracy can be achieved when these algorithms are
applied to micro array data obscured by intensive coherent noise. Nevertheless, extensive
experiments with real 3C array records have to be made to prove that the practical
application of these algorithms is expedient.

In this section the evaluation of the azimuth and apparent velocity of a plane wave using data
from a 3-component small aperture seismic array is treated as a statistical problem of estimating the
parameters of a multidimensional stochastic time series when these parameters
comprise both informative and nuisance ones. This approach is new and distinct from the
conventional one, according to which the evaluation of arrival directions is interpreted in the
framework of spatial spectral (F-K) analysis [24,26].

3.3.2. Mathematical models of observations as a random time series with informative and nuisance parameters

The mathematical model of seismic signals and noise at the array sensors was discussed in
Section 1. Recall that in the time domain this model is

$$x(t) = y(t) + \xi(t) = h_w(t,p,V) * s_w(t) + \xi(t) \qquad (1)$$

where the notation is the same as in Section 1.

In this section we make two alternative assumptions about $s(t)$, the time function
(waveform) of a seismic phase:

a) The signal $s(t)$ is a realisation of a Gaussian stationary random time series with zero mean
and power spectral density $\varphi_s(f)$.

Though this assumption seems artificial from the point of view of seismological practice,
it actually means that in the synthesis and analysis of estimation algorithms we confine
ourselves to taking into account only an averaged signal power spectrum, without any
consideration of the signal phase spectrum. From this point of view assumption a) is merely a
convenient way to enable us to use the technique of statistical time series analysis [12,17].

b) The signal $s(t)$ is an unknown deterministic time series.

In seismological practice the problem of seismic wave AD estimation often has to be solved
without any knowledge of the waveform $s(t)$. The reason is that seismic event waveforms (and
their power spectra) vary strongly from one case to another.

To estimate the informative parameters $p = (p_x, p_y)$ of observations fitting
eq.(1) under a priori uncertainty about the signal waveform, we have to introduce
unknown nuisance parameters of the signal. The seismic wave velocity in the surface layer of the
Earth's crust is often also unknown, and for 3C array data analysis one has to regard it as an
additional nuisance parameter.

When modelling $s(t)$ as a random Gaussian time series, let us suppose that its power spectral
density is a known function $\varphi(f,c)$ depending on $q$ nuisance parameters $c = (c_1,\dots,c_q)^{T}$. The linear
model seems to be the simplest in this case:

$$\varphi(f,c) = \sum_{k=1}^{q} c_k\,\varphi_k(f) \qquad (2)$$

where the $\varphi_k(f)$ may be typical power spectra of seismic signals for different frequency bands:
$\varphi_k(f) = \psi(f - f_k)$, where the $f_k$ are the central frequencies of such bands. The fact that $s(t)$ is a
broadband signal allows $q$ in eq.(2) to be small, i.e. the parametric representation in
eq.(2) is efficient with a rather small number $q$ of nuisance parameters.
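
The linear parametric model of the signal power spectrum can be illustrated as follows. The sketch assumes NumPy and a triangular band shape for $\psi$; the shape, the band centres and the function names `psd_basis` and `signal_psd` are hypothetical choices made only for this example.

```python
import numpy as np

def psd_basis(f, centres, width):
    """Basis spectra psi(f - f_k): triangular bumps of half-width `width` (an assumed shape)."""
    return np.clip(1.0 - np.abs(f[:, None] - centres[None, :]) / width, 0.0, None)

def signal_psd(f, c, centres, width):
    """Linear model: sum over k of c_k * psi(f - f_k)."""
    return psd_basis(f, centres, width) @ c

f = np.linspace(0.0, 10.0, 101)            # frequency grid, Hz
centres = np.array([2.0, 5.0, 8.0])        # central frequencies f_k of the bands
c_true = np.array([2.0, 0.5, 1.0])         # nuisance parameters c_k

phi = signal_psd(f, c_true, centres, 3.0)
# Because the model is linear in c, the parameters are recoverable by least squares
c_fit, *_ = np.linalg.lstsq(psd_basis(f, centres, 3.0), phi, rcond=None)
print(c_fit)
```

Three coefficients describe the whole spectrum here, which illustrates why a broadband signal needs only a small number $q$ of nuisance parameters.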

For the signal model b), where the seismic phase waveform is assumed to be a deterministic
time series, let us suppose that all elements of this time series are unknown a priori. That is,
in this approach, in addition to the informative parameters $p = (p_x, p_y)$, we have $N$ unknown nuisance
parameters $s(t)$, $t = 1,\dots,N$. Thus, in this case the number of scalar nuisance parameters is equal to
the number of vector observations $x(t)$, $t = 1,\dots,N$.

The rigorous statistical approach implies that determination of the apparent slowness vector using
records from a 3C array is to be regarded as a statistical estimation problem involving nuisance
parameters. To make the problem formulation fit practical conditions, we assume the
observations $x(t)$, $t = 1,\dots,N$, to be correlated both for different $t$ and for different coordinates $x_j(t)$,
$j = 1,\dots,m$, of the vectors $x(t)$. We will suppose below that the matrix power spectral density (MPSD)
$F(f)$ of the noise is known. This assumption can be justified in many cases by the ability to observe
noise records $\xi(t)$ just before a wave phase arrival. Hence the spectral density of the noise may be
estimated by means of a special adaptation procedure, discussed in Section 2. The estimate
$\hat{F}(f)$ can be used hereafter in the algorithms of apparent slowness vector estimation.

The synthesis of statistically optimal algorithms is based on a criterion of estimation
accuracy, which must be properly chosen for the problem under consideration. Hereafter we use
as the accuracy criterion for any apparent slowness vector estimate $p^{*}_N = (p^{*}_{xN}, p^{*}_{yN})$ the
asymptotic covariance matrix of the estimate

$$\lim_{N\to\infty} N\,E\{(p^{*}_N - p)(p^{*}_N - p)^{T}\} = \Psi_p. \qquad (3)$$

The estimate $p^{*}_N$ of the apparent slowness vector $p$ which provides the minimal value of $\mathrm{tr}(\Psi_p)$ we
will call the asymptotically efficient (AE) estimate [17]. Note that $\mathrm{tr}(\Psi_p)$ is the sum of the
asymptotic mean square deviations of the estimates $(p^{*}_{xN}, p^{*}_{yN})$ from the true parameter values
$(p_x, p_y)$. The criterion of asymptotic accuracy of eq.(3) allows one to use the analytical techniques
of asymptotic estimation theory [2]. This makes it possible to carry both the synthesis
and the analysis of the asymptotically optimal estimates through to the end and to obtain explicit estimation
algorithms and formulas for their asymptotic covariance matrices. Note that it is an
unresolvable task (from both the theoretical and the practical point of view) to develop an explicit
estimation algorithm which would be the best in terms of a non-asymptotic accuracy
criterion, for example, an algorithm having the smallest mean square deviations for any finite
sample size $N$.

3.3.3. Asymptotically efficient estimates of apparent slowness vector for random signal waveforms

Under the assumption that the signal $s(t)$ is a stationary zero-mean Gaussian random process,
the observed vector time series $x(t)$ is a multidimensional zero-mean stationary Gaussian
time series whose distribution is entirely determined by its matrix power spectral density
(MPSD) $F_x(f)$, $f \in [0, f_s/2]$, where $f_s$ is the sampling frequency. With the natural additional
assumption that the signal waveform $s(t)$ and the noise $\xi(t)$ are statistically independent, it is easy to
derive that

$$F_x(f) = F(f) + H_w(f,p,V)\,\varphi_s(f,c) \qquad (4)$$

where $H_w(f,p,V) = h(f,p,V)\,h^{*}(f,p,V)$ is a [3m x 3m] matrix;
$h(f,p,V) = (\exp[-i2\pi f(u_j^{T}p)]\,b(p,V),\ j = 1,\dots,m)$ is the 3m-dimensional column vector of
medium frequency responses along the paths of seismic wave propagation from the first array sensor
to the other ones; and $\varphi_s(f,c)$ is the signal power spectral density, which has a linear parametric
representation in accordance with eq.(2). Note that the MPSD $F_x(f)$ depends explicitly on the
informative and nuisance parameters of the problem under consideration.

It has been demonstrated in [16,17] that if the probability distribution of the observations satisfies
rather weak constraints, the asymptotically efficient (AE) estimate $p^{*}_N$ of the informative parameter
$p$ can be obtained (simultaneously with the estimate $\theta^{*}_N = (c^{*}_N, V^{*}_N)$ of the nuisance
parameter $\theta = (c, V)$) by means of the maximum likelihood approach. According to this
approach

$$(p^{*}_N, \theta^{*}_N) = \arg\max_{p,\theta}\,L(X_N, p, \theta), \qquad (5)$$

where $L(X_N,p,\theta) = \ln(W(X_N,p,\theta))$ is the logarithm of the multivariate probability density of the
observations and $X_N = (x_1^{T},\dots,x_N^{T})^{T}$ is the combined column vector of all observations.

Let us construct a computational algorithm for the AE estimate of the parameter
$\vartheta = (p,c,V)$ based on observations that fit the model eq.(1). The likelihood function
for the observations $X_N$ is the logarithm of the Gaussian multivariate probability density of
$X_N$, regarded as a function of the parameter $\vartheta$. It can be represented as

\[ L(X_N,\vartheta) = -\frac{mN}{2}\ln(2\pi) - \frac{1}{2}\ln\det C_N(\vartheta)
   - \frac{1}{2} X_N^T C_N^{-1}(\vartheta) X_N \qquad (6) \]

where $C_N$ is a block Toeplitz $[mN \times mN]$ matrix composed of the blocks

\[ C_\tau = E\{x(t)x^T(t+\tau)\} = E\{y(t)y^T(t+\tau)\} + E\{\xi(t)\xi^T(t+\tau)\} \]

The likelihood function eq.(6) depends on the problem parameters via the elements of the inverse
matrix $C_N^{-1}(\vartheta)$ and its determinant. Clearly, if mN is at all large, constructing a
computational procedure for maximising $L(X_N,\vartheta)$ is a difficult task. However, a simple
approximation of this functional in the frequency domain is possible if the number of
observations N is sufficiently large.

Let us treat the observations which fit the model eq.(1) in the discrete frequency domain
provided by the Discrete Finite Fourier Transform (DFFT) [3]. In this domain the observations are:

\[ x_j = h_j(p,V)\,s_j + \xi_j + O(1/\sqrt{N}), \quad j \in 1,N \qquad (7) \]

where N is the number of discrete observations; $x_j$ and $\xi_j$ are the DFFT of x(t) and
$\xi(t)$, $t \in 1,N$, respectively;
$h_j(p,V) = \left(\exp(-i\lambda_j f_s u_k^T p)\,b_w(p,V),\ k \in 1,m\right)$; $f_s$ is the
sampling frequency; $\lambda_j = 2\pi j/N$; $s_j$, the DFFT of $s_w(t)$, is the discrete complex
spectrum of a seismic phase waveform. It is known that the DFFT frequency observations $x_j$ have
weak mutual correlations for large N even if the noise observations themselves are significantly
correlated in time, i.e. the DFFT is an asymptotically decorrelating transform. If one ignores the
weak statistical dependence of $x_j$ for different j (i.e. drops the terms $O(1/\sqrt{N})$ in
eq.(7)), one gets the following approximate expression for the likelihood function eq.(6) in the
discrete frequency domain:


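\[ L(X_N,\vartheta) = -C + \sum_{j=1}^{N} \ln\det Q_j(p,\theta)
   + \sum_{j=1}^{N} |z_j(p,\theta)|^2 \qquad (8) \]

where C, which collects the terms $\ln\det F_j^{-1}$ and $x_j^* F_j^{-1} x_j$ summed over
$j = 1,\dots,N$, is the term independent of the parameter $\vartheta$;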


Note that $z_1,\dots,z_N$ are the discrete frequency-domain outputs of the conditional optimal
Wiener group filter applied to the array data x(t). This filter transforms a multichannel input
signal into a scalar trace and maximises the signal/noise ratio along this trace under the
condition that the output noise is whitened.

It is rather easy to derive an analytical expression for the limit of the normalised Fisher matrix
for the parameters $\vartheta$ [12]. It enables us to calculate the asymptotic covariance matrix
of the AE estimate of the apparent slowness vector.

An analysis of eq.(8) reveals that if the signal/noise ratio is sufficiently large, the values
$\varphi_j^{-1}(c)$ become negligible for all j in comparison with
$h_j^*(p,V) F_j^{-1} h_j(p,V)$. Hence the dependence of the functional $\Lambda(X_N,p,c,V)$ upon
the unknown signal spectrum $\varphi_j(c)$ weakens and vanishes completely as
$\varphi_j^{-1}(c) \to 0$. So if the signal/noise ratio is large enough, the AE estimate of the
apparent velocity becomes close to the estimate maximising the following functional

\[ \Lambda(X_N,p,V) = \sum_{j=1}^{N} |z_j(p,V)|^2 \qquad (9) \]

where

\[ z_j(p,V) = \frac{h_j^*(p,V)\,F_j^{-1}\,x_j}{\sqrt{h_j^*(p,V)\,F_j^{-1}\,h_j(p,V)}} \qquad (10) \]

is the output of the noise-whitening optimal group filter [16] and $F_j = F(\lambda_j)$ are the
values of the noise MPSD at the DFFT frequencies. In other words, if the signal/noise ratio is
large enough, one needs to take into account only the noise matrix spectral density, while
information about the signal spectrum becomes insignificant.
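
The high signal/noise functional of eq.(9), (10) can be illustrated with a minimal numerical
sketch. It is not the authors' implementation: it assumes a 1-component array (the factor b(p,V)
is taken as 1), a noise MPSD that is the same at every frequency, synthetic data generated
directly in the frequency domain, and a plain grid search over the slowness vector. The function
names, array geometry and all numerical values are hypothetical.

```python
import numpy as np

def steering_vectors(freqs, coords, p):
    """Rows h_j = exp(-i 2 pi f_j u_k^T p), k = 1..m (assumption: b(p, V) = 1)."""
    delays = coords @ p                                   # (m,) delays in seconds
    return np.exp(-2j * np.pi * np.outer(freqs, delays))  # (N, m)

def whitened_output_power(X, F_inv, H):
    """Sum over j of |z_j|^2, z_j = h_j^* F_j^-1 x_j / sqrt(h_j^* F_j^-1 h_j)."""
    num = np.einsum("jk,jkl,jl->j", H.conj(), F_inv, X)      # h^* F^-1 x
    r = np.einsum("jk,jkl,jl->j", H.conj(), F_inv, H).real   # h^* F^-1 h
    return float(np.sum(np.abs(num) ** 2 / r))

rng = np.random.default_rng(0)
m, N = 8, 64
coords = rng.uniform(-1.5, 1.5, size=(m, 2))   # hypothetical sensor offsets, km
freqs = np.linspace(0.25, 4.0, N)              # hypothetical DFFT frequencies, Hz
p_true = np.array([0.10, -0.05])               # slowness vector, s/km

# Noise MPSD: incoherent part plus a component common to all sensors.
F = np.eye(m) + 4.0 * np.ones((m, m))
F_inv = np.broadcast_to(np.linalg.inv(F), (N, m, m))
chol = np.linalg.cholesky(F)
white = (rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))) / np.sqrt(2)
noise = white @ chol.T                         # rows have covariance F

s = 10.0 * np.exp(2j * np.pi * rng.random(N))  # strong signal spectrum
X = steering_vectors(freqs, coords, p_true) * s[:, None] + noise

# Grid search for the slowness vector maximising the functional.
grid = np.linspace(-0.3, 0.3, 13)
best_value, p_hat = -np.inf, None
for px in grid:
    for py in grid:
        p = np.array([px, py])
        value = whitened_output_power(X, F_inv, steering_vectors(freqs, coords, p))
        if value > best_value:
            best_value, p_hat = value, p
```

In practice the grid search would only provide a starting point for an iterative optimiser; it is
used here because it keeps the sketch short.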
3.3.4. Asymptotically efficient estimates of apparent slowness vector for small signal-to-noise ratio

The AE estimate of the apparent slowness vector considered in Section 3.3 is the best in terms of
the asymptotic quality criterion eq.(3). However, it requires a numerical technique for finding the
maximum of the functional eq.(8), which makes the computational algorithm rather laborious. The
iterative estimation procedure is complicated and slowed down by the necessity to maximise the
functional eq.(8) over (q+3) parameters, q+1 of which are nuisance parameters, i.e. are actually
unnecessary for the main estimation problem.

The additional assumption that the signal/noise ratio is small allows a significant
simplification of the asymptotically efficient estimation algorithm. A mathematically correct
formulation of the apparent slowness estimation problem under this assumption is to estimate the
parameters of a weak signal y(t) in the following model of observations

\[ x(t) = N^{-1/4}\,y(t) + \xi(t), \quad t \in 1,N \qquad (11) \]

In accordance with this model the matrix power spectral density (MPSD) of the time series x(t),
$t \in 1,N$, is equal to

\[ F_x(f) = F(f) + \frac{1}{\sqrt{N}}\,H(f,p,V) \sum_{k=1}^{q} c_k \varphi_k(f) \qquad (12) \]

It can be shown [12,17] that under weak restrictions on the noise MPSD F(f) the likelihood
function (LF) eq.(6) of the observations eq.(1), (2) has the following asymptotic representation:

\[ L(X_N \mid c/\sqrt{N}, p, V) = -L(X_N,0) + c^T \delta(X_N,p,V)
   - \frac{1}{2}\,c^T \Gamma(p,V)\,c + \alpha_N(X_N,p,V,c) \qquad (13) \]

where

\[ \delta(X_N,p,V) = \frac{1}{\sqrt{N}} \sum_{j=1}^{N}
   \left[\,|h_j^*(p,V) F_j^{-1} x_j|^2 - h_j^*(p,V) F_j^{-1} h_j(p,V)\right] \varphi_j \]

is the vector asymptotically sufficient statistic for the small nuisance parameters
$c/\sqrt{N}$;

\[ \Gamma(p,V) = \frac{1}{N} \sum_{j=1}^{N}
   \left[h_j^*(p,V) F_j^{-1} h_j(p,V)\right]^2 \varphi_j \varphi_j^T \]

is the limit of the normalised Fisher matrix for the small nuisance parameters $c/\sqrt{N}$
[12,16,17]; $\alpha_N(X_N,p,V,c)$ is the residual term of the LF asymptotic decomposition, which
converges to zero in probability as a random process in the space $C[P \times \Theta]$ of
continuous functions of $(p,\theta)$ with the uniform metric, where $\Theta$ is a bounded set of
nuisance parameters $\theta = (c,V)$ and $P$ is a bounded set of informative parameters p;
$L(X_N,0)$ is the likelihood function of "pure" noise. Since this term does not depend on the
parameters (p,c,V), it may be dropped hereinafter.

As follows from eq.(13), the AE parameter estimate for observations fitting the model eq.(1), (2)
has the form

\[ \hat\vartheta_N = (\hat p_N, \hat\theta_N) = \arg\max_{c,p,V}
   \left[c^T \delta(X_N,p,V) - \frac{1}{2}\,c^T \Gamma(p,V)\,c\right] \qquad (14) \]

The estimate $\hat\vartheta_N$ appears to be significantly simpler than the general-case AE
estimate obtained by maximising eq.(8). Indeed, by maximising eq.(14) in c with fixed p and V it is
easy to derive

\[ \hat c_N(p,V) = \arg\max_{c} \left[c^T \delta(X_N,p,V) - \frac{1}{2}\,c^T \Gamma(p,V)\,c\right]
   = \Gamma^{-1}(p,V)\,\delta(X_N,p,V) \qquad (15) \]

The substitution of eq.(15) into eq.(14) gives

\[ (\hat p_N, \hat V_N) = \arg\max_{p,V} R(X_N,p,V) \qquad (16) \]

where

\[ R(X_N,p,V) = \delta^T(X_N,p,V)\,\Gamma^{-1}(p,V)\,\delta(X_N,p,V) \qquad (17) \]

Thus when the signal/noise ratio is small the problem of apparent slowness vector and seismic
wave phase velocity estimation can be reduced to maximisation of the functional eq.(17) in p
and V. This is an essentially simpler task than the maximisation of the functional eq.(8), both
because fewer calculations are needed to evaluate $R(X_N,p,V)$ and its derivatives with respect
to $p_x$, $p_y$ and V, and because the optimisation involves only three parameters, $p_x$, $p_y$
and V, instead of the (q+3) parameters (c,p,V). As a rule this yields a significant increase in
the speed of iterative numerical optimisation procedures. Maximisation of the functional eq.(17)
is also made easier by the existence of simple explicit expressions for its first- and
second-order partial derivatives with respect to the parameters p, V.
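
The reduction from eq.(14) to eq.(17) can be checked numerically. The sketch below is only an
illustration under stated assumptions: the array `Phi` is assumed to hold the basis spectra as
`Phi[j, k]` = value of the k-th basis PSD at the j-th DFFT frequency, the noise MPSD is taken as
the identity, and the data are random numbers used solely to exercise the algebra. The function
names are hypothetical. Note that the maximum of the quadratic form in eq.(14) over c equals R/2;
the factor 1/2 does not affect the position of the maximum over (p, V).

```python
import numpy as np

def low_snr_statistics(X, F_inv, H, Phi):
    """Sufficient statistic delta and matrix Gamma from the definitions below eq.(13)."""
    N = X.shape[0]
    filt = np.abs(np.einsum("jk,jkl,jl->j", H.conj(), F_inv, X)) ** 2  # |h^* F^-1 x|^2
    r = np.einsum("jk,jkl,jl->j", H.conj(), F_inv, H).real            # h^* F^-1 h
    delta = ((filt - r)[:, None] * Phi).sum(axis=0) / np.sqrt(N)      # shape (q,)
    Gamma = np.einsum("j,jk,jl->kl", r ** 2, Phi, Phi) / N            # shape (q, q)
    return delta, Gamma

def r_functional(delta, Gamma):
    """R = delta^T Gamma^-1 delta together with the maximiser c_hat of eq.(15)."""
    c_hat = np.linalg.solve(Gamma, delta)
    return float(delta @ c_hat), c_hat

rng = np.random.default_rng(2)
N, m, q = 32, 4, 3
X = rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))
H = np.exp(2j * np.pi * rng.random((N, m)))       # unit-modulus steering vectors
F_inv = np.broadcast_to(np.eye(m), (N, m, m))     # assumption: white noise
Phi = rng.random((N, q)) + 0.1                    # hypothetical basis spectra

delta, Gamma = low_snr_statistics(X, F_inv, H, Phi)
R, c_hat = r_functional(delta, Gamma)

def quadratic(c):
    """Objective of eq.(14) for fixed (p, V)."""
    return float(c @ delta - 0.5 * c @ Gamma @ c)

perturbations = rng.standard_normal((20, q))
```

For fixed (p, V) the nuisance vector c is thus eliminated by one small linear solve, which is why
only three parameters remain for the iterative search.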

To investigate the asymptotic quality of the estimates eq.(16) one needs first to prove the
$\sqrt{N}$-consistency of the estimates $(\hat p_N, \hat V_N)$, i.e. the convergence of
$(\hat p_N, \hat V_N)$ in probability with speed $1/\sqrt{N}$ to the true parameter values
$(p_0, V_0)$. Then one needs to find the asymptotic covariance of the apparent slowness vector
estimate $\hat p_N$, i.e. the limit

\[ \lim_{N\to\infty} E_{\vartheta}\{N(\hat p_N - p_0)(\hat p_N - p_0)^T\} = K_p(p_0,c_0,V_0) \qquad (18) \]

Comparison of $\mathrm{tr}\,K_p(p_0,c_0,V_0)$ with the lower bound for estimation errors
guaranteed by the asymptotically efficient algorithm eq.(8) enables one to determine a possible
loss in asymptotic quality of the apparent slowness estimate eq.(16) compared with the AE
estimate. Such investigations may be performed using the techniques developed in [12].

Note that the formula for the asymptotically sufficient statistic (see eq.(13)) may be
transformed to the following form

\[ \delta(X_N,p,V) = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} \left(|z_j|^2 - 1\right) r_j(p,V)\,\varphi_j \qquad (19) \]

where $r_j(p,V) = h_j^*(p,V) F_j^{-1} h_j(p,V)$ and $z_j$ are determined by eq.(10). One can see
from eq.(19) that for the case of small signal/noise ratio the main part of the AE estimation
procedure is the calculation of the outputs $z_1,\dots,z_N$ of the whitening group filter.

3.3.5. Apparent slowness estimates for completely unknown signal waveforms. 

If the waveform s(t) of the seismic phase is completely unknown (Section 3.3.1, model b), the
spectral samples $s_j$, $j \in 1,N$ in eq.(7) are completely unknown too and have to be considered
as a set of nuisance parameters for the problem. If one drops the small terms $O(1/\sqrt{N})$ in
eq.(7), then the observations in the discrete frequency domain fit a non-linear regression model
with unknown "regressors" $s_j$. As the number of nuisance parameters $s_j$ tends to infinity with
an increase of the number of data samples N, a reasonable question arises: does such a formulation
of the problem allow the existence of some $\sqrt{N}$-consistent estimate, for which estimation
errors tend to zero as N tends to infinity and the asymptotic covariance matrix eq.(3) exists?
Hereafter we construct an example of such an estimate by means of the maximum likelihood
technique. The proof of the $\sqrt{N}$-consistency of this estimate and an analytical expression
for its asymptotic covariance matrix may be obtained using the results of [12].

Let us derive a maximum likelihood estimate of the apparent slowness vector p and wave velocity V
for the case of a completely unknown waveform $s_j$, $j \in 1,N$. We will use an approximate
expression for the likelihood function in the frequency domain, similar to eq.(8). As $s_j$ are
deterministic (though unknown) complex values and $\xi_j$ are Gaussian vectors, the same
considerations which have led to eq.(6) allow us to derive the following asymptotic expression
for the likelihood function in the case under discussion

\[ L(X_N \mid p,V,\{s_j\}) = l_f(X_N \mid p,V,\{s_j\}) + O(1/\sqrt{N}), \]
\[ l_f(X_N \mid p,V,\{s_j\}) = C - \frac{1}{2}\sum_{j=1}^{N} \ln\det F_j
   - \frac{1}{2}\sum_{j=1}^{N} (x_j - h_j(p,V)s_j)^* F_j^{-1} (x_j - h_j(p,V)s_j) \qquad (20) \]

The joint maximum likelihood (ML) estimates of the informative parameters p, V and the unknown
nuisance parameters $s_j$, $j \in 1,N$, are the solution of the following set of equations

\[ \frac{\partial}{\partial\,\mathrm{Re}\,s_j}\, l_f(X_N \mid p,V,\{s_j\}) = 0; \quad
   \frac{\partial}{\partial\,\mathrm{Im}\,s_j}\, l_f(X_N \mid p,V,\{s_j\}) = 0; \quad j \in 1,N; \]
\[ \frac{\partial}{\partial p_\alpha}\, l_f(X_N \mid p,V,\{s_j\}) = 0; \quad \alpha \in x,y; \quad
   \frac{\partial}{\partial V}\, l_f(X_N \mid p,V,\{s_j\}) = 0; \qquad (21) \]

where $\mathrm{Re}\,s_j$ and $\mathrm{Im}\,s_j$ are the real and imaginary parts of the complex
signal spectral samples.

Having the positive definite Hermitian matrix $F_j^{-1}$ expressed in the form
$F_j^{-1} = F_j^{-1/2} F_j^{-1/2}$, one can write the main term of eq.(20) as

\[ l_f(X_N \mid p,V,\{s_j\}) = C - \frac{1}{2}\sum_{j=1}^{N} \ln\det F_j
   - \frac{1}{2}\sum_{j=1}^{N} |n_j - d_j(p,V)s_j|^2 \qquad (22) \]

where

\[ n_j = F_j^{-1/2} x_j, \quad d_j = F_j^{-1/2} h_j(p,V). \qquad (23) \]


The first subsystem of equations (21) is linear and, as is easy to verify, has the following
solution

\[ \hat s_j(p,V) = \frac{d_j^*(p,V)\,n_j}{|d_j(p,V)|^2}, \quad j \in 1,N. \qquad (24) \]


Substituting $\hat s_j(p,V)$ into the second subsystem of eq.(21) one obtains (after some simple
algebraic transformations) that in the problem being considered the ML estimates of the apparent
slowness vector and wave velocity are solutions of the following set of non-linear equations

\[ P_\alpha(X_N,p,V) = \sum_{j=1}^{N} x_j^* \left(\frac{\partial}{\partial p_\alpha} A_j(p,V)\right) x_j = 0;
   \quad \alpha \in x,y; \]
\[ P_V(X_N,p,V) = \sum_{j=1}^{N} x_j^* \left(\frac{\partial}{\partial V} A_j(p,V)\right) x_j = 0; \qquad (25) \]

where

\[ A_j(p,V) = \left( I - \frac{F_j^{-1} H_j(p,V)}{\mathrm{tr}\,F_j^{-1} H_j(p,V)} \right) F_j^{-1} \qquad (26) \]




All of the above leads to the important conclusion that the ML estimates of the seismic wave
apparent slowness vector p and velocity V in the case of a completely unknown waveform have to
minimise the functional

\[ T(X_N,p,V) = \sum_{j=1}^{N} x_j^* A_j(p,V)\,x_j \qquad (27) \]

The functional eq.(27) has a clear geometrical interpretation. According to eq.(7) the
signal frequency samples $x_j$ belong (with an accuracy up to $O(1/\sqrt{N})$) to some
1-dimensional subspaces $\Pi_j(p,V)$ of the complex m-dimensional $x_j$-vector spaces $C_j^m$.
It is easy to demonstrate that the functional eq.(27) is the sum of squares of the distances from
the observations $x_j$ to the corresponding subspaces $\Pi_j(p,V)$, where the distances are
calculated in the metrics determined by the inner products

\[ (a,b) = a^* F_j^{-1} b, \quad a, b \in C_j^m \qquad (28) \]

Thus in the problem being considered the maximum likelihood technique leads to an estimate of the
vector (p, V) which is a generalisation of the known estimate by the orthogonal statistical
regression method.

By a simple transformation of the functional eq.(27) one can verify that the method described
provides exactly the same estimates of the apparent slowness vector p and velocity V as the
functional eq.(9), (10), which provides the AE estimates for a random Gaussian signal with a
large signal/noise ratio. It is evident that eq.(26) and eq.(27) result in

\[ T(X_N,p,V) = \sum_{j=1}^{N} x_j^* F_j^{-1} x_j
   - \sum_{j=1}^{N} \frac{|h_j^*(p,V) F_j^{-1} x_j|^2}{h_j^*(p,V) F_j^{-1} h_j(p,V)} \qquad (29) \]

The first term in eq.(29) does not depend on the parameters (p, V), so the estimation of (p, V)
is reduced to maximisation of the second term, which coincides exactly with the functional
$\Lambda(X_N,p,V)$ of eq.(9), (10). This very important coincidence exhibits the close connection
between two mathematical models of observations that look so different at first glance: the model
of a random Gaussian signal with an unknown power spectral density and the model of a
deterministic signal with a completely unknown waveform. This interesting (but natural)
correlation of these models has been discussed in [12].
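
The algebra linking eq.(24), (26), (27) and (29) can be verified at a single frequency with a few
lines of linear algebra. The sketch below uses random test data (a hypothetical Hermitian positive
definite noise matrix, steering vector and observation) and is meant only as a consistency check
of the formulas, not as a processing routine.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 5

# Hypothetical Hermitian positive definite noise MPSD at one DFFT frequency.
B = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
F = B @ B.conj().T + m * np.eye(m)
F_inv = np.linalg.inv(F)

h = np.exp(2j * np.pi * rng.random(m))                     # steering vector h_j
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)   # observation x_j

# Matrix A_j of eq.(26) with H_j = h_j h_j^*.
H = np.outer(h, h.conj())
A = (np.eye(m) - F_inv @ H / np.trace(F_inv @ H)) @ F_inv
term_27 = (x.conj() @ A @ x).real                          # one term of eq.(27)

# Corresponding term of eq.(29).
r = (h.conj() @ F_inv @ h).real
term_29 = (x.conj() @ F_inv @ x).real - abs(h.conj() @ F_inv @ x) ** 2 / r

# Eq.(24) written in terms of x_j: s_hat = h^* F^-1 x / (h^* F^-1 h).
s_hat = (h.conj() @ F_inv @ x) / r
residual = x - h * s_hat
# Squared distance from x_j to the line spanned by h_j in the metric of eq.(28).
distance_sq = (residual.conj() @ F_inv @ residual).real
```

The three quantities `term_27`, `term_29` and `distance_sq` agree to rounding error, which is the
geometrical interpretation stated above: each term of the functional is a squared distance in the
noise-weighted metric.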
4. EXPERIMENTAL STUDY OF ADAPTIVE ALGORITHM APPLICATIONS
TO 3-COMPONENT VERY BROAD BAND SMALL APERTURE ARRAY DATA PROCESSING

4.1 Introduction 

The well tested NORESS-type small aperture arrays are essentially oriented to regional seismic
monitoring [5] and are equipped with short period instruments which record seismic energy above
0.5 Hz. For this reason there have been few studies of low frequency noise field coherency
within the aperture of such arrays. The α-stations of the International Monitoring Network
currently being deployed are expected to be designed as 3-component wide band small aperture
arrays containing about ten very broad band (VBB) 3C sensors within an aperture of about 1.5 km
[9]. Prototypes of the α-stations were tested in the framework of the IRIS PASSCAL project. For
example, the VBB subarrays of the Pinjon Flat (USA) and Geyocha (Turkmenistan) experimental arrays
can be treated as such prototypes. Fig.1 shows the configuration of the VBB subarray of the
Geyocha array, deployed not far from the Turkmenistan capital Ashgabad. The array was operated
during 1993-1994. It was situated on thick sedimentary rocks and hence was affected by intensive
seismic noise (especially in the low frequency range). This sedimentary basin has been subjected
to intensive geological crumpling, so multiple folds have been formed. These folds and other
medium inhomogeneities cause intensive scattering of seismic waves. For this reason the
extraction of seismic phase waveforms and the estimation of phase parameters relying only on the
wave polarization characteristics is a very difficult task in this region. Employment of 3C VBB
seismic arrays significantly increases the information about the characteristics of the complex
wave-fields generated in the medium by seismic events and greatly facilitates event analysis.
Recording the three spatial components of the seismic wave field at different array receivers
makes it possible to distinguish the seismic phases based on their polarization characteristics
and to extract the SH and Love phases, which are not recorded by 1C vertical sensors. Observation
of an event wave-field at spaced sites makes it possible to eliminate the influence of lateral
inhomogeneities of the medium by "smoothing" their impact on local wave-field behavior. This
leads to a much higher accuracy of event parameter measurements. Besides, by analyzing 3C array
recordings of seismic noise one can study the spatial and polarization characteristics of the
noise wave-field. This provides better noise suppression by the 3C adaptive processing procedures
as compared with the 1C ones. However, the data from 3C arrays should be handled with some care,
because involving horizontal seismograms in the processing may in some cases diminish the quality
of event analysis. This can occur, for example, while extracting waveforms of teleseismic
longitudinal body waves. The seismic power of such waves is mainly concentrated in the vertical
component of the event wave-field, while the seismic noise field often has the most power in the
horizontal components. So the signal-to-noise ratio in the horizontal seismograms of a 3C array
can be rather low in such cases, and processing the horizontal seismograms together with the
vertical seismograms can only lead to deterioration of the waveform extraction quality.

Below we discuss the results of an analysis of the noise field in the Geyocha array region and of
adaptive processing of seismograms from several regional events recorded by this array. We are
grateful to Dr. A.Dainty from the Earth Division of US AF Phillips Laboratory, who collected
these multichannel array recordings and transferred them to the computer network of the Moscow
IRIS Data Analysis Center. Owing to Dr. A.Dainty's help our research group obtained the relevant
real data for thorough testing and refinement of the adaptive 3-component array data processing
technique being developed.

Some characteristics of the events are given in Table 1.

Table 1

Ev.  Origin time          Source coordinates,  Source  Magnitude  Epicenter  Back      P-wave appar.  Source type
n    (date, time)         degr.                depth,  mb         distance,  azimuth,  veloc.,
                                               km                 degr.      degr.     km/sec

1    07.10.94 03:25:58.1  41.66° N 88.75° E    0       6.0        23.74°     71.29°    11.46          China nuclear test
2    10.06.94 06:25:58.0  41.69° N 88.79° E    0       5.7        23.77°     71.22°    11.46          China nuclear test (6 3C st.)
3    18.10.93 13:57:14.6  22.13° N 62.85° E    10      5.2        16.29°     164.13°   8.67           Earthquake (11 3C st.)
4    02.10.93 01:17:30.4  39.07° N 69.97° E    14      5.0        9.35°      79.37°    8.04           Earthquake

The apparent velocities of the P-phase arrivals were calculated from the known source
coordinates using travel time tables derived from the Jeffreys-Bullen Earth model.

4.2. Geyocha array noise feature study 

As we noted above, the main reason for employing adaptive statistical array data processing is
the coherence of the noise field observed in many practical situations. So the first stage of
our experimental study of the Geyocha array recordings was an investigation of the noise field
characteristics. Typical Geyocha sensor noise recordings are shown in Fig.2a. The noise
seismograms were registered at the N, E and Z components of the central Geyocha seismometer
ORGH. Power spectral densities (PSD) of the noise field components estimated using the noise
realizations in Fig.2a are shown in Fig.2b for the frequency range 0-10 Hz and in Fig.2c (with
higher resolution) for the frequency range 0-1 Hz. We see that the PSD has four peaks at
frequencies 0.07, 0.2, 1.5 and 3.5 Hz. The peaks in the low frequency range are produced by
storm microseisms. Their genesis is discussed below. The peaks in the high frequency range are
split, probably because the resolving power of the spectral analysis was excessive in this
range, which led to statistical fluctuations of the PSD estimate. However, these peaks may also
be connected with some man-made periodic sources.

In Fig.2d the noise PSD for the central vertical sensor is compared with the PSD of a beam
composed from the noise records of all 12 vertical VBB instruments; the beam was steered to the
Lop Nor China Test Site (azimuth 71.7°, app. velocity 3.55 km/sec). We see that the noise with
frequencies below 1 Hz is not suppressed by the beamforming procedure. This indicates that the
noise field is highly spatially correlated at these frequencies. Fig.3a shows the noise coherence
functions for the vertical components of the most closely located (B32, C22) and the most distant
(NHB, SEH) array sensors. We see that the first function keeps magnitudes very close to 1 over
the whole frequency band 0-1 Hz and the second function has values larger than 0.8 up to 0.5 Hz.
Fig.2d and Fig.3a convinced us that for extraction of seismic phase waveforms in the frequency
band below 1 Hz one has to apply the adaptive optimal group filtering (AOGF) algorithm instead of
the conventional beamforming procedure. In Fig.3b the noise coherence functions calculated for
the different pairs of components of the central Geyocha 3C seismometer are displayed. All three
functions have very low magnitude, which gives rise to doubts that the 3-component AOGF procedure
provides much signal-to-noise ratio gain in comparison with the 1-component one. In fact, our
experiments showed that a gain of about 4.5 dB can be achieved (as is theoretically predicted for
the case of uncorrelated spatial noise components). Nevertheless, it was demonstrated that the
3-component modification of the AOGF method is very helpful for separation of seismic phases
having their main oscillation energy in different spatial components (longitudinal, transverse
and vertical) if these phases are obscured by coherent seismic noise or by the coda waves of the
previous phases.
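
Why plain beamforming cannot suppress highly coherent noise can be seen from a simple
back-of-the-envelope model. The sketch below assumes (this is not stated in the text) that the m
noise traces have equal power and the same zero-lag pairwise correlation rho, in which case the
power of their average is sigma^2 (1 + (m - 1) rho) / m. It also evaluates 10 log10(3), one
plausible reading of the theoretical gain from combining three uncorrelated equal-power
components; the text itself quotes an experimental value of about 4.5 dB. The function names are
hypothetical.

```python
import math

def beam_noise_power(m, rho, sigma2=1.0):
    """Power of the mean of m equal-power traces with pairwise correlation rho."""
    return sigma2 * (1.0 + (m - 1) * rho) / m

def gain_db(power_in, power_out):
    """Noise suppression expressed in decibels."""
    return 10.0 * math.log10(power_in / power_out)

# Noise suppression of a 12-sensor beam for several assumed correlation levels.
suppression = {rho: gain_db(1.0, beam_noise_power(12, rho)) for rho in (0.0, 0.8, 1.0)}

# Gain from three uncorrelated, equal-power components (assumed interpretation).
three_component_gain = 10.0 * math.log10(3.0)
```

With rho = 0 the 12-sensor beam suppresses the noise by about 10.8 dB, with rho = 0.8 by less
than 1 dB, and with rho = 1 not at all, which matches the qualitative behaviour described for
frequencies below 1 Hz.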

The following analysis will be mainly carried out for the frequency range 0.01-2.5 Hz, so let
us study the noise characteristics for this range in more detail. First, note that for
frequencies exceeding 0.3 Hz the noise PSD is almost the same for all spatial components. This
is also valid for the band 0.05-0.1 Hz. In contrast, in the vicinity of the peak frequency 0.2 Hz
the vertical noise component has a power exceeding the power of the horizontal noise components
by more than 4 times. This can be explained by noise polarization elongated in the Z-direction,
which allows us to suspect that this noise component propagates as longitudinal body waves
arriving at the array from below. This conclusion is confirmed by the results of the F-K
analysis discussed below.

In the very low frequency band (below 0.03 Hz) the situation is the opposite: the noise power
for the Z-component is about 10 times less than for the horizontal components. This can be
explained by the physical nature of these noise oscillations, which are caused by fluctuations
of the atmospheric pressure. The impact of these fluctuations on the horizontal seismometer
components is much stronger than on its vertical component [10].

The equal power of the noise field components in the frequency band 0.05-0.1 Hz leads to the
hypothesis that this noise component could be a transient one propagating as Rayleigh waves with
a small ellipticity factor. This conclusion is also confirmed by the F-K analysis.

The spatial spectrum map (F-K map) for the Geyocha array noise in the frequency band
0.15-0.25 Hz (corresponding to the strongest spectral peak of the Fig.2c spectra) is depicted
in Fig.4a. According to the map this noise component could be treated as a composition of
body waves arriving at very low incidence angles. Given the thick sedimentary layers beneath the
array, this does not contradict the description of this component as a scattered field generated
by ocean storms. Judging by the back azimuth of this noise component (230°), the "source" of
these microseisms is situated in the Indian Ocean. The F-K map for the other low frequency noise
peak in the band 0.05-0.07 Hz (Fig.4b) testifies that this noise component could be composed of
surface waves arriving from the Caspian Sea (note that the apparent velocity of these surface
waves is slightly greater than the typical Rayleigh wave velocity).

Because of the small aperture and the high correlation of noise between different sensors, one
may erroneously declare that a Geyocha-type VBB array is a bad instrument for analyzing
teleseismic and far regional surface waves. Indeed, the beamforming method does not provide
any improvement in SNR for these waves, and their F-K analysis is hampered by the coherent low
frequency noise. To demonstrate the applicability of the AOGF method in this case we simulated
seismograms of a Rayleigh wave generated by a teleseismic explosion and recorded by the
1-component Geyocha array. The waveforms of the seismograms at different array sensors were
modeled as Berlage pulses with frequency 0.06 Hz and duration 150 sec. The pulses were shifted
in time as for a Rayleigh wave originating from the Chinese Lop Nor test site. Note that the
central frequency of the simulated wave was chosen equal to the frequency of a Geyocha coherent
noise peak. The simulated seismograms were then mixed with real Z-component records of the
Geyocha VBB array noise. The mixture (with the power SNR=0.1) was processed by the AOGF in the
frequency band 0-1 Hz. The results shown in Fig.5 allow us to assert that the AOGF has great
potential in this case: the Rayleigh phase SNR was improved by a factor of 40 owing to effective
suppression not only of the transient (0.06 Hz) but also of the scattered (0.2 Hz) noise
component.
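
The construction of such a simulated data set can be sketched as follows. This is only an
illustration under explicit assumptions: the Berlage pulse is taken in its usual form
t^n exp(-alpha t) cos(2 pi f0 t + phase), truncated after 150 sec; the exponent n, the decay
alpha, the sensor coordinates, the sampling rate and the onset time are hypothetical; the power
SNR is defined over the whole record; and white Gaussian noise stands in for the real array
noise, which is not reproduced here.

```python
import numpy as np

def berlage(t, f0, n=2.0, alpha=0.04, phase=-np.pi / 2, duration=150.0):
    """Berlage pulse, zero before onset (t < 0) and after `duration` seconds."""
    tt = np.clip(t, 0.0, None)
    pulse = tt ** n * np.exp(-alpha * tt) * np.cos(2 * np.pi * f0 * tt + phase)
    return np.where((t >= 0.0) & (t <= duration), pulse, 0.0)

def mix_at_power_snr(signal, noise, snr):
    """Scale the signal so that mean signal power / mean noise power = snr, then add."""
    scale = np.sqrt(snr * np.mean(noise ** 2) / np.mean(signal ** 2))
    return scale * signal + noise, scale

fs = 10.0                                        # hypothetical sampling rate, Hz
t = np.arange(0.0, 600.0, 1.0 / fs)
rng = np.random.default_rng(3)
coords = rng.uniform(-0.75, 0.75, size=(12, 2))  # hypothetical (east, north) offsets, km

# Plane wave arriving from back azimuth 71.7 deg with apparent velocity 3.55 km/sec:
# the slowness vector points away from the source direction.
back_azimuth = np.radians(71.7)
velocity = 3.55
p = -np.array([np.sin(back_azimuth), np.cos(back_azimuth)]) / velocity
delays = coords @ p                              # seconds, relative to the array origin

onset = 200.0                                    # hypothetical onset time, sec
clean = np.stack([berlage(t - onset - tau, f0=0.06) for tau in delays])
noise = rng.standard_normal(clean.shape)         # stand-in for real noise records
mixture, scale = mix_at_power_snr(clean, noise, snr=0.1)
achieved_snr = np.mean((scale * clean) ** 2) / np.mean(noise ** 2)
```

The array `mixture` then plays the role of the input to the beamforming or group filtering
procedure, with `clean` kept as the reference waveform for judging the recovery.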
4.3. Simulation experiments with real Geyocha 3C noise recordings

To assess the advantages of the adaptive statistical approach for 3C broad band array data
processing in comparison with the 1C case, we simulated Geyocha array 3C seismograms as
produced by the Lg regional seismic phase with central frequency 0.2 Hz, arrival azimuth 71.7°
and velocity 4 km/sec. Note that the signal frequency was chosen to coincide with the strongest
noise spectral peak (originating from the scattered storm microseisms). The simulated
seismograms were mixed with real 3C recordings of Geyocha VBB noise to provide the power
SNR=0.1. The mixture was processed by the 1C (vertical components only) and 3C beamforming and
adaptive group filtering algorithms. The results are shown in Fig.6. Fig.6a demonstrates the
output traces of the one-component beamforming (trace 2), AOGF (trace 3) and WOGF (trace 4)
algorithms applied to all 12 VBB vertical array channels (in the frequency band 0-1 Hz). Trace 1
is the record of the simulated Lg waveform. We see that, as in the previous case (extraction of
a teleseismic long period surface wave), the conventional beamforming fails to recover the signal
waveform, but the AOGF provides high suppression of the coherent array noise and good extraction
of the weak seismic Lg phase solely owing to the difference in the signal and noise spatial
characteristics.

In Fig.6b the results are shown for the same procedures applied to the vertical components of
only the four most distantly spaced VBB instruments: ORGH, NH, SWH and SEH (see Fig.1). We see
that the micro-array consisting of only 4 1C stations also provides rather good noise
suppression and signal extraction. But the quality of signal recovery when using only 4 1C
stations is significantly poorer than for the total Geyocha vertical subarray.

Fig.6c shows the processing results for the simulated data from 12 3C stations. The quality of
signal recovery here is approximately the same as for 12 vertical channels (Fig.6a), if one does
not take into account the very long period (about 100 sec.) noise oscillations. The latter are
very strong in the horizontal E-component noise recordings and leak into the output AOGF traces.
So the addition of horizontal instruments in this experiment apparently does not enhance the
array processing capability, because the 1C AOGF procedure already compensates all coherent
noise and leaves at the output only incoherent noise residuals.

Fig.6d exhibits the processing results for data from the micro-array described above, consisting
of four 3C stations. The quality of signal waveform recovery in this case is close to that for
the array consisting of 12 vertical instruments (Fig.6a). Though the number of seismic channels
in both cases is the same, deployment of an array consisting of 12 vertical stations is much more
expensive than deployment of four 3C stations because of the cost of the additional station
vaults and the wiring from the vaults to the central hub.




























The simulation experiments made with modeled LF signals and real noise recordings of the Geyocha
array allow us to assert that VBB small aperture micro-arrays have the potential to be a good
tool for measuring the parameters of surface waves generated by teleseismic and far regional
events. Extension of the adaptive processing technique to the 3C case provides a procedure for
recovering the waveforms of low frequency seismic phases obscured by long period coherent noise;
good results can be achieved with 3C micro-arrays consisting of 4-6 3C VBB instruments.


4.4. Analysis of Geyocha 3C seismograms from Lop Nor explosion on 07.10.1994 

4.4.1. P-phase arrival direction estimation. 

The simulations discussed above were made under the assumption that the medium in the vicinity
of the array is laterally homogeneous and the event waves are plane. However, the real medium
beneath the array has very complex geological characteristics, and this seriously hampers the
analysis of event phase characteristics. The multiple scatterers generate scattered waves which
arrive at the array sensors with small delays relative to the main phase wave. This leads to
distortions of the polarization characteristics of this wave. As a result, even in the first
seconds after phase onsets the particle motion has a form much more complicated than the
theoretically predicted one.

The particle motion of the P-wave of event n2 from Table 1 is shown in Fig.7. It has the
elliptic feature that is peculiar to inhomogeneous and anisotropic media [1]. Table 2 contains
the results of polarization analysis of the P-wave of event n1 made for the recordings of the
12 VBB 3C Geyocha seismometers.

Table 2

Seismometer    Back azimuth α    Incidence angle β

ORGH           63.8              18.4
NH             43.6              11.2
SWH            44.6              19.3
SEH            67.4              13.9
A22            68.7              19.2
B32            70.3              16.0
C22            76.6              12.2
D33            55.7              12.4
E22            61.1              16.0
F32            53.9              18.1
G22            46.2              17.9
H32            61.8              19.6

Mean values    59.5              16.2
RMS error      10.3              3.0
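
The summary rows of Table 2 can be recomputed from the twelve individual estimates. The sketch
below assumes that "RMS error" means the population standard deviation about the mean (the text
does not define it); under that assumption the azimuth value 10.3 is reproduced, while the
incidence angle spread comes out at about 2.9, slightly below the tabulated 3.0. The last line
compares the mean back azimuth with the event 1 back azimuth of 71.29° listed in Table 1.

```python
from statistics import fmean, pstdev

# Polarization estimates from Table 2, degrees: (back azimuth, incidence angle).
table2 = {
    "ORGH": (63.8, 18.4), "NH": (43.6, 11.2), "SWH": (44.6, 19.3), "SEH": (67.4, 13.9),
    "A22": (68.7, 19.2), "B32": (70.3, 16.0), "C22": (76.6, 12.2), "D33": (55.7, 12.4),
    "E22": (61.1, 16.0), "F32": (53.9, 18.1), "G22": (46.2, 17.9), "H32": (61.8, 19.6),
}

azimuths = [az for az, _ in table2.values()]
incidences = [inc for _, inc in table2.values()]

az_mean, az_rms = fmean(azimuths), pstdev(azimuths)
inc_mean, inc_rms = fmean(incidences), pstdev(incidences)

# Offset of the mean polarization azimuth from the Table 1 back azimuth of event 1.
azimuth_offset = 71.29 - az_mean
```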
We see that the deviations of the azimuth and incidence angle estimates from their mean values
are very large. This testifies that serious medium inhomogeneity exists even inside the small
array aperture (less than 2 km). The mean value of the azimuth is about 12° less than the real
azimuth value (Table 1). This can be connected with the anomalous polarization of longitudinal
waves in the area, which was already mentioned in [6].

At the same time the F-K analysis of the event P-wave-field made using recordings from the
12 vertical Geyocha seismometers (Fig.8) gave a P-phase arrival azimuth estimate of 67.6°,
which is much closer to the real value of 71.3°. The 4° deviation of the azimuth estimate to
the north can be explained by the impact of the great Tibet and Tien Shan mountain provinces
along the path of wave propagation. The F-K estimate of the P-wave apparent velocity,
11.9 km/sec, also corresponds well to the value determined from the Jeffreys-Bullen travel time
tables (11.46 km/sec). The slight excess of the estimate over the theoretical value may be
connected with the impact of the low-velocity upper sedimentary layer in the Geyocha area.

The P-wave velocity in the medium beneath the array can be estimated by
comparing the results of the polarization and F-K analyses. Employment of the simplest relation
sin β = p_h V_p, where p_h = 1/v_ap is the horizontal (apparent) slowness, gives the value
V_p = 3.6 km/sec, which is significantly less than the value V_p = 5.5 km/sec
assumed in the Jeffreys-Bullen Earth model. Such a low velocity value compels us to
suspect that for the V_p estimation in the given case one should employ Kennett's model of
the P-wave interaction with the day surface, which takes into account the wave transformations
on reflection from this boundary.
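
The simplest relation between incidence angle, apparent velocity and medium velocity can be written as a short calculation. This is a sketch only: the function names are hypothetical, and the text does not state exactly which angle and velocity estimates were substituted to obtain the quoted value, so the example uses only the Jeffreys-Bullen values quoted above.

```python
import math

def medium_velocity(incidence_deg, apparent_velocity):
    """V_p = v_ap * sin(beta): velocity beneath the array from the incidence
    angle beta (polarization analysis) and the apparent velocity v_ap (F-K)."""
    return apparent_velocity * math.sin(math.radians(incidence_deg))

def incidence_angle(medium_vel, apparent_velocity):
    """Inverse relation: incidence angle (degrees) implied by V_p and v_ap."""
    return math.degrees(math.asin(medium_vel / apparent_velocity))

# Illustrative inputs: the Jeffreys-Bullen values quoted in the text.
beta_jb = incidence_angle(5.5, 11.46)
print(f"incidence angle implied by V_p=5.5, v_ap=11.46 km/sec: {beta_jb:.1f} deg")
print(f"round trip: V_p = {medium_velocity(beta_jb, 11.46):.2f} km/sec")
```

Under this simple model the Jeffreys-Bullen velocities imply an incidence angle well above the measured angles in Table 2, which is another way of expressing the low V_p estimate.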

4.4.2. Extraction of phase waveforms from background noise.

An experimental study of the effectiveness of the 3C AOGF method in comparison with the 1C case
in extracting weak event P-phase waveforms from coherent background seismic noise was
performed with the help of the following simulation. The Geyocha 3C recordings of the event
n1 P-wave (shown in Fig.9a for the central sensor components) were scaled and mixed with the
Geyocha noise recordings (shown in Fig.2a for the same components) to get the signal-to-noise
ratio SNR=1 (in the frequency band 0.01-2.5 Hz). The 3C seismogram of the mixture for the
central Geyocha sensor is shown in Fig.9b. The vertical lines in this figure mark the time
interval in which the explosion P-wave signal was inserted into the noise. For signal waveform
extraction from noise we used both the total set of seismograms from the 12 3C Geyocha sensors
and the seismograms from the subarray consisting of the 3 outer Geyocha 3C sensors (NH, SWH, SEH)
and the central 3C sensor ORGH. For data processing we used the following procedures:

• conventional beamforming procedure;

• traditional undistorting adaptive optimal group filtering procedure (AOGF1, Section 3.2.2);

• AOGF procedure with additional constraints requiring zero partial derivatives of the
sensitivity function with respect to the x and y apparent slowness (AOGF2, Section 3.2.5);

• adaptive whitening group filtering procedure (AWGF, Section 3.2.2).

The 1- and 3-component versions of these procedures were employed, being applied to the 1C
vertical array data set and to the 3C array data set, respectively. For noise adaptation a
multichannel ARMA model was used with the AR part of order 2 and the MA part of order
8. The noise matrix covariance function was estimated using multichannel data in the time
intervals which do not contain the signal (the outer intervals in Fig.9b).
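
The scaling step of the simulation (inserting a signal into recorded noise at a prescribed signal-to-noise ratio) can be sketched as follows. This is an illustration under stated assumptions, not the processing actually used: SNR is taken here as the ratio of mean signal power to mean noise power, the band-pass filtering to 0.01-2.5 Hz is omitted, and the noise, wavelet, sampling rate and function name are synthetic stand-ins.

```python
import numpy as np

def mix_at_snr(signal, noise, start, target_snr=1.0):
    """Scale `signal` and add it to `noise` at sample `start` so that the ratio of
    mean signal power to mean noise power (assumed SNR definition) equals target_snr."""
    signal = np.asarray(signal, dtype=float)
    mixture = np.array(noise, dtype=float)      # copy, so the input noise is untouched
    noise_power = np.mean(mixture ** 2)
    signal_power = np.mean(signal ** 2)
    scale = np.sqrt(target_snr * noise_power / signal_power)
    mixture[start:start + signal.size] += scale * signal
    return mixture, scale

rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)                          # stand-in for recorded noise
t = np.arange(400) / 20.0                                  # hypothetical 20 Hz sampling
signal = np.sin(2 * np.pi * 1.0 * t) * np.exp(-t / 5.0)    # stand-in for a P wavelet
mixture, scale = mix_at_snr(signal, noise, start=800, target_snr=1.0)
```

The samples outside the insertion window remain pure noise, which corresponds to the outer intervals used for noise adaptation.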

The signal-to-noise ratios obtained as a result of applying all the procedures described
are given in Table 3.

Table 3

Output signal-to-noise ratio

Procedure used         Total array    4-sensor array
Z-component beam       0.7            0.64
Z-component AOGF1      27.6           2.4
Z-component AOGF2      25.8           1.2
Z-component AWGF       491.6          154.7
3-component AOGF1      30.3           5.5
3-component AOGF2      32.7           4.7
3-component AWGF       731.1          200.4


The output traces of the procedures for the 12 Z-sensor array are depicted in Fig.10a, for the 12
3C-sensor array in Fig.10b, for the 4 Z-sensor subarray in Fig.10c and for the 4 3C-sensor array in
Fig.10d. Comparison of the results presented in Table 3 and in Fig.10 allows us to draw the
following conclusions:

a) Conventional one-component beamforming does not provide extraction of the
signal waveform in the wide frequency band of the study: 0.01-2.5 Hz.

b) The optimal group filtering procedure extracts the P-phase waveform with undistorted
reproduction of all frequency components of the signal in the wide frequency band. The
procedure provides a gain in power SNR in comparison with beamforming of approximately 30 for
the 12 3C-sensor array and approximately 9 for the 4 3C-sensor array.

c) The results of the experiment do not reveal an advantage of the adaptive optimal group
filtering with additional constraints on the spatial sensitivity diagram (AOGF2) in
comparison with the traditional AOGF (AOGF1). At the same time, the AOGF2 procedure
requires additional computational resources and is more time consuming than AOGF1.

d) The adaptive whitening group filtering procedure (AWGF) provides extremely high
noise suppression and signal extraction by combining optimal filtering in the spatial and
frequency domains. Even for the 4 3C-sensor microarray it produced in the experiment a power
SNR gain equal to 200. As is seen from Fig.10, the procedure preserves most structural
features of the phase waveform: the phase onset time is determined from the AWGF trace with
the greatest precision; the distinct wavelets inside the waveform are also not
smoothed, which allows weak secondary phases to be detected against the background of the main
phase oscillations, and so on. The only thing which cannot be guaranteed is preservation of the
shape of the phase power spectral density. This can hamper estimation of the magnitude based
on the AWGF trace. Nevertheless, there is hope of developing a method for correcting such
magnitude estimates for the characteristics of the whitening procedure.

e) The employment of the 3C AOGF procedure instead of the 1C procedure for phase waveform
extraction from the total Geyocha array recording set does not lead to a significant improvement
in SNR: 30-33 for the 3C case instead of 26-28 for the 1C case. This can be explained by the very
low noise coherence between the spatial components of the 3C sensors and by the many times larger
number of parameters which have to be estimated in 3C array noise modeling in comparison with 1C
modeling. Nevertheless, for the microarray consisting of 4 3C sensors the 3C AOGF
filtering provides an SNR gain in comparison with the 1C variant close to the theoretically
predicted 4.5 dB.
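
The comparison in item e) can be made explicit by converting the ratios of the Table 3 entries to decibels. The sketch below uses only the tabulated values; the dictionary layout and the function name are illustrative.

```python
import math

# Output power SNR values from Table 3: (total array, 4-sensor array).
table3 = {
    "Z AOGF1": (27.6, 2.4),
    "Z AOGF2": (25.8, 1.2),
    "Z AWGF": (491.6, 154.7),
    "3C AOGF1": (30.3, 5.5),
    "3C AOGF2": (32.7, 4.7),
    "3C AWGF": (731.1, 200.4),
}

def gain_db(snr_out, snr_ref):
    """Power SNR gain in decibels."""
    return 10.0 * math.log10(snr_out / snr_ref)

for name in ("AOGF1", "AOGF2", "AWGF"):
    total = gain_db(table3[f"3C {name}"][0], table3[f"Z {name}"][0])
    micro = gain_db(table3[f"3C {name}"][1], table3[f"Z {name}"][1])
    print(f"{name}: 3C over 1C gain {total:.1f} dB (total array), "
          f"{micro:.1f} dB (4-sensor array)")
```

For the two AOGF variants the 4-sensor gains bracket the 4.5 dB figure, while the total-array gains stay near 1 dB or below.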

4.5. Analysis of Geyocha 3C seismograms from Lop Nor explosion on 10.06.94 

The second event analyzed in our experimental study was the Chinese nuclear test made on
10.06.94 at the Lop Nor test site. For technical reasons we have the event recordings only
from six 3C sensors of the Geyocha array. These sensors are marked by crosses in Fig.1.
We see that the disposition of the sensors is close to symmetric with respect to the straight line
connecting the outer sensors NH and SEH. This allows us to predict that the spatial sensitivity
diagrams of the microarray composed of these sensors have central symmetry (with
respect to the point p_x = p_y = 0) for any steering direction, and that for steering directions
close to perpendicular to the line of symmetry the array spatial sensitivity diagram is very wide.
The effective subarray aperture in the latter direction is less than 0.5 km, so it is really
equivalent to a microarray.


4.5.1. P-phase arrival direction estimation. 

The disposition of the Geyocha seismometers whose recordings were available is very unfavorable
for the F-K analysis of signals arriving from the Lop Nor test site direction. For this reason the
P-wave F-K map calculated by the conventional algorithm applied to the Z-component subarray
seismograms has a very smooth maximum (Fig.11). The estimates of the wave arrival direction
(azimuth α=77.5° and apparent velocity v_ap=10.9 km/sec) are farther from the theoretically
predicted values than for the Lop Nor event n1 discussed above (see Fig.8). The F-K estimates made
using different time intervals of the P-phase waveform demonstrated rather strong variability. We
explain this by the poor sensor disposition combined with the complex structure of the P-wave
polarization. The latter preserves plausible linearity only during the first 4 seconds after the
P-phase onset. The F-K map in Fig.11 was calculated for just this interval.
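
For readers unfamiliar with the conventional F-K map, the following is a minimal frequency-domain sketch of beam power computed over a grid of trial slowness vectors. It is not the implementation used in this study: the sensor coordinates are hypothetical (they are NOT the real Geyocha geometry), the input is a noise-free synthetic plane wave, and the function names are illustrative.

```python
import numpy as np

def fk_power_map(spectra, freqs, coords, slowness_axis):
    """Conventional (beam power) F-K map summed over frequencies.

    spectra: complex array (n_sensors, n_freqs) of sensor Fourier coefficients
    freqs: frequencies in Hz; coords: (n_sensors, 2) east/north offsets in km
    slowness_axis: trial slowness values in sec/km used for both components
    """
    power = np.zeros((slowness_axis.size, slowness_axis.size))
    for ix, sx in enumerate(slowness_axis):
        for iy, sy in enumerate(slowness_axis):
            delays = coords @ np.array([sx, sy])               # sec, per sensor
            steer = np.exp(2j * np.pi * np.outer(delays, freqs))
            beam = np.sum(spectra * steer, axis=0)             # align and stack
            power[ix, iy] = np.sum(np.abs(beam) ** 2)
    return power

def slowness_to_arrival(sx, sy):
    """Back azimuth (deg clockwise from north) and apparent velocity (km/sec)."""
    back_azimuth = np.degrees(np.arctan2(-sx, -sy)) % 360.0
    return back_azimuth, 1.0 / np.hypot(sx, sy)

# Hypothetical 6-sensor geometry (km); NOT the real Geyocha coordinates.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [0.9, -0.5],
                   [0.3, 0.2], [-0.4, 0.3], [0.2, -0.6]])
freqs = np.array([0.5, 1.0, 1.5, 2.0])
true_s = np.array([-0.08, -0.03])        # slowness (east, north) of a synthetic plane wave
spectra = np.exp(-2j * np.pi * np.outer(coords @ true_s, freqs))

axis = np.linspace(-0.2, 0.2, 81)
fk = fk_power_map(spectra, freqs, coords, axis)
ix, iy = np.unravel_index(np.argmax(fk), fk.shape)
azimuth, velocity = slowness_to_arrival(axis[ix], axis[iy])
```

With a small aperture the maximum of such a map is broad, so with real noisy data the position of the peak becomes sensitive to the time window chosen, as described above.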

Such instability of the conventional Z-component F-K analysis stimulated us to employ
generalized F-K analysis based on the 2-component (horizontal) and 3-component seismograms
recorded by the subarray. As is seen from Fig.12, the event P-phase has rather powerful
horizontal components, which is related to the complex medium structure beneath the array. The
generalized 2-component F-K analysis, taking into account only transverse phase oscillations,
produced in this case the F-K map shown in Fig.13. The azimuth and apparent velocity
estimates (α=69.5°, v_ap=11.1 km/sec) provided by this map differ significantly less from the
theoretical values (α=71.3°, v_ap=11.5 km/sec) than in the case of the conventional Z-component
F-K analysis.

The procedure of generalized 3-component F-K analysis of the P-phase wave field requires
information about the longitudinal wave velocity V_p beneath the array. If this information is
absent, the generalized 3C F-K analysis can be repeated with different trial V_p
values. The V_p value which provides the highest maximum of the F-K map can be regarded as
the estimate of the real V_p value. (Note that this method is statistically well grounded.) This
procedure applied to the 3C P-wave seismograms of the Lop Nor event n2 gave a V_p estimate
equal to 4.3 km/sec. The generalized 3C F-K map corresponding to this velocity value is shown
in Fig.14. It provides an azimuth estimate of 67.7°, which is closer to the real value and almost
the same as the F-K azimuth estimate obtained for the Lop Nor event n1 (67.5°). However, the
apparent velocity estimate obtained by this method turned out to be excessively high: 16.1 km/sec,
which is almost 1.5 times larger than the theoretical value. This convinces us that the simplest
model of the P-wave interaction with the day surface (employed in the tested version of
the generalized 3C F-K algorithm) does not correspond to the real Geyocha conditions. Possibly
better accuracy can be achieved by employing the Kennett interaction model with a properly
assigned factor μ in the relation V_p = μV_s. This conclusion is confirmed by the result of the
velocity V_p estimation based on polarization and F-K analysis of the set of 3C seismograms for
the Lop Nor event n2. The azimuth and incidence angle estimates calculated from the available
P-phase recordings of 6 3C seismometers are listed in Table 4.

Table 4

Seismometer name    Back azimuth α (deg)    Incidence angle β (deg)
ORGH                60.3                    22.6
NH                  36.3                    9.5
SEH                 39.8                    22.6
A22                 66.8                    20.0
B32                 74.0                    19.7
C22                 71.6                    17.5
Mean values         61.5                    18.3
RMS error           15.2                    4.5


The P-wave medium velocity estimate based on the simplest equation sin β = V_p / v_ap is equal in
this case to V_p = 3.7 km/sec, which corresponds well to the value obtained for the Lop Nor event
n1. This value is too low and cannot be explained by the properties of the upper sedimentary layer.

Since the P-phase polarization is strongly elliptic even in the first seconds after the onset
(Fig.7), it is natural to employ for the generalized 3C F-K analysis the
mathematical model for elliptic particle motion (in fact, the theoretical Rayleigh wave
polarization model). This attempt provided the F-K map shown in Fig.15. The P-phase azimuth
and apparent velocity estimates obtained with the help of this algorithm (α=71.7°, v_ap=12.1
km/sec) are very close to the theoretical values (α=71.3°, v_ap=11.5 km/sec).

The discussion above allows us to assert that employment of horizontal component seismograms
in the framework of the generalized F-K analysis provides a significant improvement in the
accuracy of seismic phase azimuth and apparent velocity measurements based on data from 3C
microarrays. The accuracy of such analysis in the case of a 6-station microarray with an effective
aperture of about 0.5 km is comparable with the accuracy of conventional Z-component F-K
analysis in the case of a NORESS-type array with an aperture of 25 km. In any case, the F-K
analysis of microarray data provides much higher accuracy of arrival direction estimation than the
polarization analysis of a single 3C seismometer.

4.5.2. Extraction of phase waveforms from background noise and the coda of previous phases.

The wavetrain components of the Lop Nor event of 10.06.1994 registered by the central
Geyocha 3C seismometer are shown in Fig.16 for the frequency band 0.01-2 Hz. The signal-to-noise
ratio of the event P-phase is rather high, so the employment of AOGF might seem unnecessary
and one might simply use conventional beamforming. However, Fig.17 shows that there exists a
significant difference in the waveforms produced by the 1-component beamforming and
1-component AOGF procedures on the basis of the subarray P-phase Z-seismograms in the frequency
band 0.01-2 Hz. The decisive distinction is that the AOGF output trace reveals the fine
structure of the P-phase wavetrain: a sequence of distinct wavelets is seen in this trace. The
secondary wavelets have almost the same form as the first one and can be interpreted as
reflections from the boundaries of medium layers. These wavelets are almost invisible in the
1-component beam trace, being masked by the leakage of low frequency (coherent) noise. Note
that the power of the beam trace is about 2 times greater than that of the AOGF trace. This is
connected with the large width of the subarray beam steered to the Lop Nor direction (the effective
aperture of the subarray for this direction is less than 0.5 km). For this reason the beamforming
procedure collects the energy of waves arriving from a large range of directions and accumulates,
besides the direct P-wave, the multiply reflected and scattered waves generated by
inhomogeneities of the medium beneath the array. A significant difference (a factor of 2.7) exists
between the maximal amplitudes of the beam and AOGF traces, which leads to an excess of 0.4
magnitude units when the event magnitude is estimated from the beam trace instead of the AOGF
trace.
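
The link between the amplitude ratio and the magnitude excess quoted above follows from the logarithmic form of magnitude scales. A minimal worked example, assuming a magnitude of the form m = log10(A) plus terms that do not depend on the amplitude A (the specific magnitude formula is not given in the text; the function name is illustrative):

```python
import math

def magnitude_excess(amplitude_ratio):
    """Magnitude offset caused by an amplitude ratio, assuming a magnitude scale
    of the form m = log10(A) + correction terms that do not depend on A."""
    return math.log10(amplitude_ratio)

excess = magnitude_excess(2.7)   # beam-to-AOGF peak amplitude ratio from the text
print(f"magnitude excess: {excess:.2f} units")
```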

Involving the recordings of the horizontal subarray components in the AOGF processing
does not provide a significant signal-to-noise ratio improvement in the extracted P-phase
waveform as compared with the Z-component AOGF trace. However, this allows us to separate the
different constituents of the complex P-phase particle motion. Fig.18 shows the results of
3-component AOGF filtering which extracts the waveforms of the P-phase particle motions along the
P-ray and in the transverse (SH) and orthogonal (SV) directions relative to this ray. The
3-component filtering algorithm used was based on the simplest model of the wave interaction
with the day surface, neglecting the wave type transformations. One can see from the
figure that already in the first seconds after the “pure” P-wave has arrived along the ray there
exist powerful motions in the direction orthogonal to the ray (SV-component). The
oscillations of these two components are shifted in phase, which leads to the elliptic P-wave
polarization mentioned above. The power of the particle motions in the direction transverse to
the ray is significantly less than in the other two directions, and these oscillations can be
related to the scattered wave fields.

The modified version of the 3-component AOGF algorithm, which takes into account the wave
transformations at the day surface, produced the output traces depicted in Fig.19. It is
noticeable that the waveforms extracted for the longitudinal, transverse and orthogonal ray
directions are exactly the same as for the simplest 3C AOGF version. The only difference is in
the power of the extracted waves, which is two times less than in the previous case; this is quite
correct from theoretical considerations.

As is seen from Fig.16, the broad-band event wavetrain does not contain any evidence
of shear wave arrivals. The application of 3-component AOGF processing also did not
extract regional or teleseismic S-phases. We regard this as a peculiarity of
this explosion which may be connected with the physical characteristics of its source.

The 3-component seismogram of the central Geyocha sensor filtered in the frequency
band 0.01-0.1 Hz is shown in Fig.20. Intensive Rayleigh and Love phase oscillations are
clearly seen in this figure. It is interesting that the 3-component AOGF algorithm makes it
possible to extract the oscillations of these waves even in the broad frequency band 0.01-2 Hz.
Before demonstrating this we discuss the estimation of the Rayleigh and Love surface wave arrival
parameters.

The 3-component subarray seismograms filtered in the band 0.01-0.1 Hz made it
possible to estimate these parameters with the help of generalized 3-component F-K analysis.
The output maps of the analysis for the Rayleigh and Love waves are shown in Fig.21 and
Fig.22. These maps provide the following estimates: for the Rayleigh wave, α=66.1° and
v_ap=2.2 km/sec; for the Love wave, α=64.1° and v_ap=3.3 km/sec. The azimuth estimates for
both waves differ from the theoretical value of 71.5°. This is undoubtedly connected with the
impact of the Pamir and Tjan-Shan mountains which lie on the theoretical path of wave
propagation. The same reason can explain the low apparent velocities of the surface waves, which
differ from the velocities predicted by the Jeffreys-Bullen Earth model (3.0 km/sec for Rayleigh
and 3.5 km/sec for Love waves).

Fig.23 presents the output traces of the 1-component and 3-component AOGF procedures
applied to the subarray seismograms in the frequency band 0.01-2 Hz. One can see that the
1-component AOGF steered to the azimuth 66.1° and apparent velocity 2.2 km/sec did not
reveal any surface wave. At the same time, the 3-component AOGF with the same steering
detected strong Rayleigh wave oscillations in the longitudinal output component and (with
steering α=64.1°, v_ap=3.1 km/sec) intensive Love wave oscillations in the transverse output
component. Note that the orthogonal output component of the 3C AOGF did not show any surface
wave oscillations (just as the 1C AOGF trace did not). This attests that the explosion Rayleigh
wave has very low ellipticity and that its vertical component has less power than the noise and
body wave coda.

The AOGF filter adaptation in the previous case was made using noise recordings before
the event P-wave arrival. If the P-phase and its coda (up to 500 sec) are included in the
adaptation interval, then the 3-component AOGF filtering of the surface waves gives much better
results (Fig.24). The 1C AOGF in this case also failed to detect the Rayleigh wave oscillations,
but the longitudinal component of the 3C AOGF revealed these oscillations very clearly due to
suppression of the coherent P-wave + P-coda oscillations. The 3C AOGF extraction of the Love wave
in the transverse component turned out to be the most effective: the high frequency P-wave +
P-coda oscillations were strongly suppressed and the multi-mode structure of the Love wave
oscillations became clearly visible. Note that the orthogonal component of the 3C AOGF also did
not detect the vertical Rayleigh wave component, though the P-wave power is suppressed there to a
greater extent than for the 1C AOGF.

The last experiment with surface waveform extraction was made in the frequency
band 0.01-0.1 Hz with adaptation including the interval of the event P-wave (up to 500 sec). The
results of this experiment are presented in Fig.25. We see that in this frequency band the
Z-component AOGF provided a refinement of the Rayleigh vertical component waveform that
allows the multi-mode structure of the wave to be detected. The Z-AOGF trace and the orthogonal 3C
AOGF trace demonstrate very good agreement of the Rayleigh wave oscillations, which
convinces us that the noise impact is practically eliminated. Nevertheless, the 3C AOGF
orthogonal trace looks slightly noisier, which is explained by low frequency noise leakage
into this trace from the original horizontal array seismograms (where this noise is always more
powerful than in the vertical seismograms). Combining the Z-AOGF trace with the
longitudinal 3C AOGF trace makes possible a comprehensive analysis of the event
Rayleigh wave characteristics. It is noticeable that the amplitude of the vertical Rayleigh
component is almost 5 times less than the amplitude of the longitudinal component. This Rayleigh
wave feature is obviously connected with some peculiarities of the regional medium structure which
should be the object of further study.

Comparison of the 3C AOGF transverse components in Fig.24 and Fig.25 confirmed the
conclusion about the multi-mode structure of the event Love wave. We are inclined to explain
some distinctions in local amplitudes between the traces by distortions due to the preliminary
low-frequency band filtering of the array seismograms. For further comprehensive
analysis we would recommend the Love waveform after 3C AOGF broad-band processing shown in the
last trace in Fig.24: this trace has almost the same signal-to-noise ratio, preserves the high
frequency signal components and is free from distortions connected with low-frequency band
filtering.

4.6. Analysis of Geyocha 3C seismograms from Oman gulf earthquake on 18.10.93 

4.6.1. Onset time estimation of the event wave phases.

The seismograms of the earthquake recorded by the central Geyocha sensor are depicted
in Fig.26. Several distinct wave phases are seen in the event wavetrain: the P-phase, manifested
in all components; a shear wave phase, manifested in the horizontal components in the form
of SH oscillations and practically invisible in the vertical component; and also very strong
oscillations of the surface waves: the Love wave in the horizontal components (starting from
300 sec) and the Rayleigh wave in the vertical component (starting from 450-500 sec).
Since the coordinates of the earthquake source and the Geyocha array location are known, it is
possible to calculate the onset times for various regional and teleseismic phases by employing
regional and global travel time tables. For the former we employed the NORSAR regional
travel time table: this was the only regional table available in the course of our study. The
teleseismic travel time table used originated from the Jeffreys-Bullen Earth model. We realize
that the NORSAR regional travel time table does not correspond to the medium structure in the
Geyocha array region, and this inevitably leads to deviations between the calculated and observed
onset times for the regional phases of the event. The calculated onset times are presented in
Table 5.

The estimation of phase onset times was accomplished with the help of the Maximum
Likelihood (ML) algorithm whose design and performance are described in [8,3,4]. This
procedure is installed as an interactive tool in the SNDA System, which provides great facilities
for interactive seismogram phase picking. Fig.27 illustrates the procedure of interactive onset
time estimation for the teleseismic and regional phases of the event using the E-component
recording of body waves in the frequency band (0.01-10) Hz. The upper trace is the original
E-seismogram, the second trace is the result of adaptive whitening of this seismogram, and the
third trace is the onset time ML function calculated for the total wavetrain interval: this is the
first, exploratory onset determination, which can give information about the number of strongest
phases and the approximate moments of their onsets. The next four traces are the ML functions
calculated in local intervals around the phase onsets.
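
To illustrate the idea of an onset-time likelihood function computed on a whitened trace, the sketch below uses a deliberately simplified stand-in: a variance change-point likelihood for a zero-mean white Gaussian series. It is NOT the ML algorithm of [8,3,4], whose details are not given here; the synthetic trace, the `margin` parameter and the function name are assumptions made for the example.

```python
import numpy as np

def onset_log_likelihood(x, margin=20):
    """Log-likelihood of a variance change at each sample k for a zero-mean
    white Gaussian series: L(k) = -k/2*ln(var1) - (n-k)/2*ln(var2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    csum = np.cumsum(x ** 2)
    loglik = np.full(n, -np.inf)                      # edges are excluded
    k = np.arange(margin, n - margin)
    var_before = csum[k - 1] / k                      # variance of x[:k]
    var_after = (csum[-1] - csum[k - 1]) / (n - k)    # variance of x[k:]
    loglik[k] = -0.5 * (k * np.log(var_before) + (n - k) * np.log(var_after))
    return loglik

rng = np.random.default_rng(1)
true_onset = 600
trace = rng.standard_normal(1000)     # stand-in for a whitened seismogram
trace[true_onset:] *= 5.0             # "phase" arrives: power increases
onset_estimate = int(np.argmax(onset_log_likelihood(trace)))
```

As in the interactive procedure described above, such a function can first be computed over the whole wavetrain for an exploratory look and then recomputed in a short interval around each candidate onset.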

The same procedures were carried out with the other two component seismograms.
The results of the onset time measurements are collected in Table 6.

Table 5

Theoretical onset times

Teleseismic phases                      Regional phases
Phase type    Onset time (h:min:sec)    Phase type    Onset time (h:min:sec)
P             14:01:06.39               Pn            14:01:02.92
S             14:04:07.60               Pg            14:02:06.78
L             14:05:51.93               Sn            14:03:50.14
R             14:07:29.09               Sg            14:06:12.87
                                        Lg            14:05:52.18
                                        Rg            14:07:28.32


Table 6

Measured onset times

Seismometer      Phase type
component        Pn             Pg             Sn             S
E-component      14:01:07.05    14:01:55.32    14:03:28.77    14:04:07.16
N-component      14:01:03.75    14:02:00.47    14:03:25.78    14:04:28.09
Z-component      14:01:03.30    14:02:00.07    14:03:27.67

The analysis of the tables above allows us to draw the following conclusions:

The Pn phase onsets estimated using the Z and N components are delayed by 0.4-0.8 sec
relative to the theoretically predicted value. The additional 3.5 sec relative delay of the Pn
onset in the E-component may be explained by the impact of the teleseismic P-phase: the
theoretical onset of this phase is close to the estimate obtained for the E-component (see
Tables 5, 6).

The Pg phase is almost invisible in the seismograms; nevertheless, it is reliably detected
by the ML onset algorithm in all components, with excellent agreement of the onsets in the N and Z
components. The Pg arrival precedes the theoretical time by 6.5 sec, which we explain by
disagreement of the NORSAR medium model with the real medium characteristics.

The Sn phase is also poorly visible in the seismograms but is undoubtedly revealed by the ML
algorithm in all components, with good agreement of the onsets in the E and Z components. This
testifies that the Sn phase is composed of both SH and SV oscillations. It also arrives earlier
than the NORSAR travel time prediction, and the time shift here is about 23 sec.

The teleseismic S-wave onset time estimated using the E-seismogram demonstrates very
good agreement with the value provided by the Jeffreys-Bullen Earth model: the discrepancy
here is only 0.4 sec. The large error of the onset estimate made using the N-component is
explained by the impact of a low-frequency noise pulse in the vicinity of the S-phase onset; this
pulse is clearly seen in the N-component seismogram depicted in Fig.28.
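
The delays quoted in these conclusions can be recomputed directly from Tables 5 and 6. The sketch below uses only the tabulated onset times; the helper names are illustrative.

```python
def to_seconds(hms):
    """Convert 'h:min:sec' to seconds since midnight."""
    h, m, s = hms.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)

# Theoretical onsets (Table 5) for the phases measured in Table 6.
theoretical = {"Pn": "14:01:02.92", "Pg": "14:02:06.78",
               "Sn": "14:03:50.14", "S": "14:04:07.60"}

# Measured onsets (Table 6); the Z-component row lists no S onset.
measured = {
    "E": {"Pn": "14:01:07.05", "Pg": "14:01:55.32", "Sn": "14:03:28.77", "S": "14:04:07.16"},
    "N": {"Pn": "14:01:03.75", "Pg": "14:02:00.47", "Sn": "14:03:25.78", "S": "14:04:28.09"},
    "Z": {"Pn": "14:01:03.30", "Pg": "14:02:00.07", "Sn": "14:03:27.67"},
}

def residual(component, phase):
    """Measured minus theoretical onset time in seconds (positive = late)."""
    return to_seconds(measured[component][phase]) - to_seconds(theoretical[phase])

for component, picks in measured.items():
    for phase in picks:
        print(f"{component} {phase}: {residual(component, phase):+.2f} sec")
```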

In connection with the described results of applying the ML algorithm for onset
time estimation of weak regional phases, the following recommendations can be stated:

1) It would be helpful to install in the SNDA system the 3-component version of the ML
onset estimation algorithm which was designed and tested by the authors of paper [8]. The
algorithm accounts for changes both in the power spectra of the seismogram components and in the
polarization of these component oscillations at the moment of phase arrival. So it could be
preferable for applications in the case of 3-component array data processing.

2) While using the ML onset estimation algorithm it would be more expedient to perform
the preliminary adaptive whitening procedure in the same local time interval which is
chosen for calculating the onset ML function. This promises some gain in accuracy of the ML
onset estimation procedure, though it increases the time consumption when estimating
multiple phase onsets.

Fig.28 shows the event body wave 3C seismograms of the central Geyocha sensor with
the margins of the Pn, Pg and Sn time intervals chosen for calculating the phase power spectra and
for estimating their arrival direction parameters with the help of wide-band F-K analysis. The
power spectra of these phases are shown in Fig.29a - 29c. As theoretically predicted, the Pn phase
has the highest frequency content: its spectrum maximum is situated at 0.7 Hz, but there
exists a second powerful spectral peak at frequency 0.4 Hz; the Pg phase has a single spectral
peak at 0.55 Hz, and the Sn phase at 0.33 Hz. The F-K analysis maps for these phases are
presented in Fig.30a - 30c. The most impressive feature of these maps is that the estimates of the
Pn and Sn wave azimuths are equal to 184.7° and 183.8°, i.e. they coincide to within 1° (the
apparent velocities of these phases are v_ap(Pn)=8.7 km/sec and v_ap(Sn)=8.0 km/sec). Note,
however, that these azimuth values differ from the theoretical one (α = 164.1°) by about 20°. Such
a divergence cannot be explained by estimation error and is obviously connected with the
peculiarities of the propagation of these waves in the region. Note that both phases propagate
along the boundary between the Earth's crust and upper mantle, and evidently this boundary has a
laterally heterogeneous structure in the region. At the same time, the Pg-phase F-K azimuth
estimate is equal to 169.1° and differs from the theoretical azimuth by only 5° (the Pg apparent
velocity is v_ap(Pg)=6.8 km/sec). Because the Pg phase propagates within the Earth's crust, the
mentioned lateral heterogeneity of the crust-mantle boundary apparently does not strongly affect
the propagation of this wave. This matter was already discussed in papers [6,7], where it was
stated that various wave phases demonstrate different arrival azimuths in the Central Asia region.
However, the authors of these papers observed this effect only for surface waves.

4.6.2. Extraction of body wave phase waveforms using adaptive array seismogram processing. 

Fig.31 shows an example of a Geyocha sensor 3C seismogram in the time interval
comprising the regional and teleseismic body waves. The oscillations of the different regional
phases are not seen clearly enough. Below we attempt to draw more detailed conclusions
about these phase waveforms. Fig.32 presents the results of Pn waveform
extraction in the frequency band (0.01-2) Hz by the beamforming and AOGF procedures using only
Z-component Geyocha seismograms. Trace 1 is the result of the conventional beamforming
procedure steered to the Pn wave direction. We see in the trace the presence of strong low
frequency noise oscillations. Trace 2 is the result of the AOGF procedure with the adaptation
interval preceding the Pn onset. Only very low frequency noise components remain in the trace,
and the Pg phase with onset time 120 sec is clearly revealed. Trace 3 is the output of AOGF
with adaptation made using the total interval of the event wavetrain recording (0-1000 sec), and
trace 4 is the same for adaptation made using the “tail” of the event wavetrain recording
following the body wave interval (200-1000 sec). We see from these traces that the low frequency
noise oscillations are completely suppressed and the fine structure of the Pn coda is revealed:
one can notice distinct high frequency wavelets starting after the Pn phase at the moments 80 and
92 sec.

Fig.33 shows the results of analysis of the Pn wave oscillations produced by the 3-component
AOGF using the set of 3C Geyocha seismograms. We see that the longitudinal Pn oscillations have
relatively low frequency, less than 0.4 Hz, but there exists a strong high frequency SV coda in the
time interval (70-80) sec and high frequency elliptic shear wave (SH-SV) coda oscillations in the
interval (80-92) sec. Apparently these coda waves are transformations of the Pn wave at
heterogeneities in the vicinity of the Geyocha array. Thus we can assert that 1C and 3C AOGF
analysis of the event Pn phase reveals the fine structure of this wave and its coda waves in the
broad frequency band, which is not evident from the original seismograms and conventional
beamforming traces.

Fig.34 and Fig.35 demonstrate the extraction of the Pg phase waveform using Z-component
and 3-component Geyocha array seismograms, respectively. Traces 1-4 in Fig.34 are
produced with the same procedures as the analogous traces in Fig.32; the only difference is that
the group filters were steered to the Pg apparent velocity instead of the Pn one. Trace 5 in
Fig.34 is the output of the AOGF with adaptation made using Z-seismograms in the interval
preceding the Pg onset (0-100 sec). Traces 1-4 in Fig.34 are very similar to traces 1-4 in Fig.32,
which is explained by the small difference in the arrival directions of the Pn and Pg phases,
which is not resolved by the small aperture Geyocha array. Trace 5 in Fig.34 provides the clearest
impression of the Pg phase waveform in comparison with the other traces. From Fig.35
one can get an impression of the sophisticated polarization of the Pg wave oscillations, which
include intensive motions not only in the longitudinal but also in the orthogonal and even
transverse directions. This is a consequence of the intensive Pg wave scattering along the path in
the heterogeneous Earth's crust in the region.

Fig.36 and Fig.37 present the results of extraction of the Sn and teleseismic S phase
waveforms using Z-component and 3C Geyocha array seismograms, respectively. Traces 1-5 in Fig.36
are produced by the same procedures as the analogous traces in Fig.34. The only difference is
that the group filters were steered to the S-wave arrival direction and the AOGF adaptation in
trace 5 was made using the time interval (0-180) sec, which precedes the Sn-phase onset. None of
the traces shows evidence of S-phase oscillations. This is explained by the low power of the
Sn-phase and the nearly transverse polarization of the teleseismic S-phase. Some hint of the
Sn-phase can be obtained from trace 5, where the oscillations become noticeably lower in
frequency after the Sn-phase onset. This can be explained by the scattering of the
relatively low frequency SH-polarized Sn-phase on the medium heterogeneity. The 3C AOGF output
traces depicted in Fig.37 present a very different picture. Though the regional Sn phase
becomes apparent in the transverse and orthogonal components only through a slight decrease in
the dominant frequency of oscillations, the teleseismic S-phase oscillations are explicitly
manifested in the transverse and orthogonal components. In comparison with, for example, the
initial seismograms of the central Geyocha sensor, these oscillations are evidently cleared of
low frequency noise components.

4.6.3. Study of event surface wave characteristics. 

Fig.38 shows the 3-component seismogram of the central Geyocha sensor rotated in the
direction of theoretical surface wave arrival. The very powerful Love and Rayleigh surface wave
oscillations allow us to investigate their polarization and spectral-velocity features. The
results of nonlinear Flinn polarization filtering applied to the 3C seismograms of the central
Geyocha sensor are presented in Fig.39. At least three distinct oscillation modes are seen for
the Rayleigh waves (in the longitudinal and orthogonal components) and two modes for the Love
wave. Evidence of these modes can be observed in the "raw" traces of Fig.38, but the
nonlinear Flinn polarization filtering makes them much clearer in this example.

Fig.40 demonstrates the results of the Love wave spectral analysis in the time and spatial
domains. Fig.40.a depicts the power spectral density of the Love wave oscillations
observed in Fig.38, trace 2, during the (300-600) sec time interval. As follows from Fig.39, this
is the spectrum of the main Love wave mode. The spectral peak is concentrated in the frequency
band (0.025-0.09) Hz with a maximum at 0.05 Hz. Detailed spectral analysis revealed that the
signal frequency content changes smoothly from 0.025 Hz at the beginning of the Love wave at
300 sec to 0.09 Hz at the end of its main mode at 600 sec. Thus this Love wave mode has strong
dispersion: its frequency changes by a factor of almost 4.

Fig.40.b shows the wide-band F-K analysis map for the frequency band (0.03-0.08) Hz.
It revealed the arrival azimuth and apparent velocity of the Love wave to be 157.6° and
Vap = 3 km/sec, respectively. The azimuth of the Love wave arrival is 6.5° less than the
theoretical one (164.1°) and is highly consistent across different frequency bands. This
follows from the results of multiple narrow band F-K analysis presented in Fig.40.c and
Fig.40.d. The F-K analysis was performed for 6 equidistant frequency bands, each 0.01 Hz wide,
in the range (0.03-0.08) Hz. From Fig.40.c one can see that the Love wave arrival azimuth
practically does not change with the wave period. This contradicts the results of paper [7],
where such a dependence was mentioned, although for a different arrival direction of surface
waves.
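
The narrow band F-K estimate discussed above can be illustrated by a minimal single-frequency
sketch in Python. It is not the SNDA implementation: the 3x3 array geometry, the sampling rate,
the treatment of the arrival azimuth as a back azimuth, and the function names
(plane_wave_delays, fourier_coefficient, fk_peak) are all assumptions made for illustration. A
noise-free synthetic plane wave with parameters close to those reported for the Love wave is
generated, and the beam power is maximized over a grid of trial azimuths and apparent velocities.

```python
import cmath
import math


def plane_wave_delays(coords_km, back_azimuth_deg, velocity_km_s):
    """Relative arrival delay (s) at each sensor for a plane wave.

    back_azimuth_deg is the direction the wave arrives from, measured
    clockwise from north; coords_km holds (east, north) sensor offsets.
    """
    az = math.radians(back_azimuth_deg)
    # The wave travels away from its back azimuth, hence the minus signs.
    sx = -math.sin(az) / velocity_km_s
    sy = -math.cos(az) / velocity_km_s
    return [sx * x + sy * y for x, y in coords_km]


def fourier_coefficient(trace, fs, freq):
    """DFT coefficient of one trace at a single frequency (Hz)."""
    return sum(v * cmath.exp(-2j * math.pi * freq * n / fs)
               for n, v in enumerate(trace))


def fk_peak(traces, coords_km, fs, freq, azimuths, velocities):
    """Grid search for the (back azimuth, apparent velocity) of maximum beam power."""
    coeffs = [fourier_coefficient(tr, fs, freq) for tr in traces]
    best_power, best_az, best_v = -1.0, None, None
    for az in azimuths:
        for v in velocities:
            delays = plane_wave_delays(coords_km, az, v)
            # Remove the trial propagation delay from each channel and stack.
            beam = sum(c * cmath.exp(2j * math.pi * freq * d)
                       for c, d in zip(coeffs, delays))
            power = abs(beam) ** 2
            if power > best_power:
                best_power, best_az, best_v = power, az, v
    return best_az, best_v


# Hypothetical 3x3 array with 10 km spacing (not the Geyocha geometry).
coords = [(10.0 * i, 10.0 * j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
fs, freq, n_samples = 1.0, 0.05, 600          # 600 s of data sampled at 1 Hz
true_delays = plane_wave_delays(coords, 158.0, 3.0)
traces = [[math.cos(2 * math.pi * freq * (n / fs - d)) for n in range(n_samples)]
          for d in true_delays]

az_est, v_est = fk_peak(traces, coords, fs, freq,
                        azimuths=range(0, 360, 2),
                        velocities=[2.0, 2.5, 3.0, 3.5, 4.0, 4.5])
print(az_est, v_est)
```

Repeating such a search for several center frequencies gives the azimuth and the apparent
velocity as functions of period, which is the idea behind curves of the kind shown in Fig.40.c
and Fig.40.d.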

Fig.40.d presents the dispersion curve of the main Love mode, i.e. the dependence of apparent
velocity on the wave period. This curve was obtained from the multiple narrow band F-K
analysis mentioned above. The dispersion curve agrees well with the theoretical one
following from the Gutenberg Earth model. Fig.41 presents the results obtained after
3C AOGF filtering of the Love wave phase using 3C Geyocha array seismograms. Adaptation of the
3C AOGF was made using array data in the time interval (0-300) sec., preceding the Love wave
onset. The traces in Fig.41 allow us to calculate the trajectory of Love wave particle motion in
the ray transverse-longitudinal directions corresponding to the time interval (350-600) sec. and
frequency band (0.01-2) Hz. This trajectory is depicted in Fig.42.a, and one can see that in the
time interval of the main Love mode waveform the interfering longitudinal component of the
Rayleigh wave has amplitudes three times smaller than the Love wave transverse oscillations.

The power spectral density of the Rayleigh wave longitudinal oscillations is shown in
Fig.42.b. The spectrum was calculated using trace 1 of Fig.38 in the time interval (400-650) sec.
We see from this figure that the Rayleigh wave spectrum has a rather narrow peak concentrated in
the frequency band (0.05-0.1) Hz with a maximum at 0.08 Hz. The results of wide-band F-K analysis
for the time interval (400-550) sec in the frequency band (0.05-0.1) Hz are shown in Fig.42.c.
The Rayleigh wave arrival azimuth estimate is 168.8°, which exceeds the theoretical value
(164.1°) by 4.7° and the arrival azimuth estimate for the Love wave by 11.2°. (Note that the
theoretical azimuth is almost in the middle between the two surface wave azimuths.) Such a
great difference between the Love and Rayleigh azimuth estimates cannot be explained by
errors of the F-K analysis, given the very high signal-to-noise ratio in both phases.
Here we again have to refer to the complexity of the Earth crust and upper mantle in the region
under study and propose a thorough investigation of the wave propagation
peculiarities in this region, which attracts close attention in regard to CTBT monitoring.
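
The azimuth deviations quoted above can be checked with a few lines of Python. The helper
azimuth_difference is a hypothetical name introduced here; it wraps the difference into the
interval (-180°, 180°] so that the same code remains valid for azimuths near north.

```python
def azimuth_difference(a_deg, b_deg):
    """Signed difference a - b in degrees, wrapped to the interval (-180, 180]."""
    diff = (a_deg - b_deg) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


theoretical = 164.1   # theoretical surface wave azimuth from the text
love = 157.6          # Love wave F-K estimate
rayleigh = 168.8      # Rayleigh wave F-K estimate

rayleigh_vs_theory = azimuth_difference(rayleigh, theoretical)
rayleigh_vs_love = azimuth_difference(rayleigh, love)
love_vs_theory = azimuth_difference(love, theoretical)
midpoint = (love + rayleigh) / 2.0   # to be compared with the theoretical azimuth

print(round(rayleigh_vs_theory, 1), round(rayleigh_vs_love, 1),
      round(love_vs_theory, 1), round(midpoint, 1))
```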


For the Rayleigh phase we could not produce a stable estimate of the dispersion curve (the
apparent velocity as a function of period) with the help of multiple narrow band F-K
analysis. This is possibly connected with the interference of the Rayleigh wave main and higher
modes, which is strong because of the rather small epicentral distance of the event. We plan to
apply sonogram (time-frequency) analysis for this purpose.


The 3C AOGF filtering of the 3-component Geyocha array seismograms with the
purpose of extracting the Rayleigh wave oscillations in the longitudinal and transverse
directions produced traces similar to the ones depicted in Fig.41. For this reason we do not
provide an additional figure with these traces. We used the extracted Rayleigh wave traces to
calculate the trajectory of Rayleigh wave particle motion in the longitudinal and orthogonal
(vertical) directions relative to the ray. This trajectory is drawn in Fig.42.d. One can see that
the trajectory has an elongated elliptic form with the ratio of ellipse axes equal to 4.3 and
with the main axis at an angle of about 30° to the longitudinal direction. This information can
be used for characterization of the medium structure in the region.
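
The report does not state how the axis ratio and the tilt of the particle motion ellipse were
measured. One common way, shown here only as a sketch under that assumption, is to take them from
the eigen-decomposition of the 2x2 covariance matrix of the two components. The function name
polarization_ellipse and the synthetic trajectory are hypothetical; the synthetic ellipse simply
reuses the values 4.3 and 30° quoted above.

```python
import math


def polarization_ellipse(x, z):
    """Axis ratio and major-axis angle (degrees from the x axis) of a 2-D trajectory.

    Purely linear motion (zero minor axis) is not handled in this sketch.
    """
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    cxx = sum((a - mx) ** 2 for a in x) / n
    czz = sum((b - mz) ** 2 for b in z) / n
    cxz = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / n
    # Eigenvalues of the 2x2 covariance matrix in closed form.
    half_trace = 0.5 * (cxx + czz)
    radius = math.hypot(0.5 * (cxx - czz), cxz)
    lam_major, lam_minor = half_trace + radius, half_trace - radius
    ratio = math.sqrt(lam_major / lam_minor)
    angle = 0.5 * math.degrees(math.atan2(2.0 * cxz, cxx - czz))
    return ratio, angle


# Synthetic elliptical motion: semi-axes 4.3 and 1, major axis tilted by 30 degrees.
tilt = math.radians(30.0)
phases = [2.0 * math.pi * k / 360 for k in range(360)]
x = [4.3 * math.cos(p) * math.cos(tilt) - math.sin(p) * math.sin(tilt) for p in phases]
z = [4.3 * math.cos(p) * math.sin(tilt) + math.sin(p) * math.cos(tilt) for p in phases]

ratio, angle = polarization_ellipse(x, z)
print(round(ratio, 3), round(angle, 3))
```

Here x plays the role of the longitudinal component and z of the vertical one; with real traces
the two inputs would be the AOGF outputs restricted to the chosen time window.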
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Fig. 1. Geyocha seismic array configuration. 
Stations used in analysis of the explosion on 10.06.91 are marked by crosses.

Fig. 2d. Power spectra of noise recorded by the Z-component of ORGH (solid curve) and of a
beam composed from noise records (dashed curve).

Fig. 3a. Coherence between Z-components of the nearest (B32 and C22) (solid curve) and
the most remote (NH and SEH) (dashed curve) Geyocha stations. Noise records on 29.04.94 01:00:00.

Fig. 3b. Coherence between different components of the 3C-seismometer ORGH.
Noise records on 29.04.94 01:00:00.

Fig. 5. Extraction of simulated Rayleigh waveforms from real Geyocha
noise using Z-component array recordings. SNR=0.1.

Fig. 7. Particle motion for the first 5 seconds of the explosion P-wave at the 3C ORGH station
on 10.06.94. f = 0.01-2 Hz.

Fig. 8. Estimation of P-wave arrival parameters for the explosion on 07.10.94. f = 0.8-1.2 Hz.

Fig. 11. F-K analysis map for the P-phase of the Lop-Nor event on 10.06.94.

Fig. 13. Map for P-phase generalized F-K analysis based on horizontal components.

Fig. 14. Map for P-phase generalized 3C F-K analysis with linear polarization model.

Fig. 15. Map for P-phase generalized 3C F-K analysis with elliptic polarization model.

Fig. 21. Map for Rayleigh wave generalized 3C F-K analysis of the 10.06.94 Lop Nor event.

Fig. 30. F-K analysis maps for earthquake regional phases.

a. Pn-phase.

b. Pg-phase.

c. Sn-phase.

Fig. 32. Extracting of Pn-phase waveform based on Z-component array data (base time
18.10.93 14.00.00.0): 1) beam trace; 2) AOGF trace with adaptation on preceding noise; 3) AOGF
trace with adaptation using total event wavetrain (0-1000) sec.; 4) AOGF trace with adaptation
using only wavetrain succeeding the Pn + Pg phases (200-1000 sec).

Fig. 34. Extracting of Pg-phase waveform, based on Z-component array (base time
18.10.93 14.00.00.0). Traces 1-4 are produced by the same procedures as analogous ones in
Fig.31. Trace 5 is the
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Fig. 40. Study of Love wave spectral-velocity features. 

a) Spectrum of the Love wave in the time interval 350-600 sec.

b) Wide band estimation of the Love wave arrival direction for the frequency range
(0.03-0.08) Hz, corresponding to its power peak in the time interval 350-600 sec.

c) Variation of the Love wave arrival azimuth with period.

d) Variation of the Love wave apparent velocity with period (dispersion curve).

Fig. 41. Broad band extracting of Love wave oscillations in frequency band (0.01-2) Hz using

Fig. 42. Study of Love and Rayleigh wave features.

a) Trajectory of Love wave particle motion in the ray transverse and longitudinal
directions for the time interval 350-500 sec. and frequency band (0.01-2) Hz.

b) Spectrum of the Rayleigh wave in the time interval 400-650 sec.

c) Wide band estimation of the Rayleigh wave arrival direction for the frequency band
(0.05-0.1) Hz, corresponding to its power peak in the time interval 400-550 sec.

d) Trajectory of Rayleigh wave particle motion in the vertical plane along
the ray for the time interval 500-650 sec. and frequency band (0.01-2) Hz.
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4.7. Detection and parameter estimation of explosion signal obscured by coda of strong 

interfering earthquake using data from small aperture array 

Capabilities of the adaptive statistical algorithms for analyzing data from small aperture
seismic arrays (SASA) are illustrated below by applying the algorithms to the problem of
detection and parameter estimation of a so-called "hidden explosion" seismic signal. The SNDA
system with its functional and operational facilities provided a convenient framework for the
automated analysis of SASA data in this illustration.

A possible scenario for evading the Comprehensive Test Ban Treaty is to perform a secret
nuclear test by triggering a nuclear device with the help of the seismic signal from a rather
strong earthquake. In this case the explosion wave phases are obscured by coda waves of the
earthquake. The latter are typically much stronger than the seismic noise at the observational
sites of a monitoring network and often do not allow the explosion signal to be detected using
data from a single seismic station. Nevertheless, estimating the current statistical features of
the noise field with a SASA and processing the SASA recordings by statistically optimal
algorithms provide a chance of reliable CTBT monitoring even when the described evasion
scenario is implemented.

Fig.7.1-7.3 illustrate the results of applying the adaptive statistical processing methods
to the "hidden explosion" problem. In this study we used multichannel seismograms from the
underground nuclear test at the Novaya Zemlya site (24 Oct. 1990) and the earthquake in Hindu
Kush (25 Oct. 1990) registered by NORESS. The simulation of a mixture of the NORESS seismograms
from the above events, which provides a "hidden explosion" signal obscured by an earthquake
coda, and the further seismogram processing for explosion signal detection and parameter
estimation were made with the help of a special SNDA script comprising a variety of SNDA stack
commands and adaptive statistical procedures.

Fig.7.1.a shows the P-wave seismogram from the Novaya Zemlya explosion (NZE) (trace (1))
and the P-wave with coda wave seismogram from the Hindu Kush earthquake (HKE) recorded by the
central NORESS sensor. The seismograms were filtered in the frequency band (0.5-5) Hz,
resampled, shifted in time and scaled by the SNDA stack commands. The simulated "hidden
explosion" 25-channel NORESS data are displayed in Fig.7.1.b. This mixture of real NZE and
HKE NORESS seismograms contains the NZE P-wave obscured by the HKE coda with
RMS SNR=0.5 and an onset time 23 sec after the HKE P-wave arrival. These data are the raw
material for the subsequent analysis. Note that the explosion signal is not recognizable in this
seismogram mixture due to the similarity of the amplitude and frequency content of the NZE
P-waves and HKE coda waves.
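
The scaling and shifting step of such a simulation can be sketched in Python as follows. This is
not the SNDA script used in the study. The function names rms and mix_at_snr, the sampling rate
and the synthetic "coda" and "wavelet" are assumptions, and the RMS SNR is taken here as the
ratio of the RMS of the scaled signal to the RMS of the background over the window occupied by
the signal, which is one possible reading of the definition.

```python
import math


def rms(values):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mix_at_snr(signal, background, snr, onset):
    """Add `signal` to `background` at sample `onset`, scaled to the requested RMS SNR.

    The SNR is defined as rms(scaled signal) / rms(background) over the
    window occupied by the signal; this definition is an assumption.
    """
    window = background[onset:onset + len(signal)]
    if len(window) < len(signal):
        raise ValueError("signal does not fit into the background after the onset")
    gain = snr * rms(window) / rms(signal)
    mixed = list(background)
    for k, s in enumerate(signal):
        mixed[onset + k] += gain * s
    return mixed, gain


fs = 20.0   # hypothetical sampling rate, Hz
# Synthetic decaying "coda" and a short "explosion" wavelet (placeholders for real data).
coda = [math.exp(-n / 800.0) * math.sin(0.9 * n) for n in range(int(60 * fs))]
wavelet = [math.exp(-n / 30.0) * math.sin(1.3 * n) for n in range(int(5 * fs))]

onset = int(23 * fs)   # 23 s after the start of the coda trace
mixed, gain = mix_at_snr(wavelet, coda, snr=0.5, onset=onset)
print(len(mixed), round(gain, 4))
```

For array data the same gain and onset would be applied to every channel so that the relative
delays between sensors, and hence the arrival direction of the hidden signal, are preserved.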

Fig.7.1.c demonstrates the detection of the NZE P-wave against the background of the HKE coda
by the adaptive statistically optimal detector (ASOD) [8,11,12]. The detection procedure is
applied to the output of the beam steered to the NZ site (trace (3)). The output of the same
beamforming procedure applied to the "pure" HKE seismograms is shown in trace (4) for
comparison with trace (3). One can see that conventional beamforming does not suppress the
HKE coda waves sufficiently to provide reliable signal detection by the standard
STA/LTA detector. At the same time trace (2), containing the ASOD time series, demonstrates
the presence of a strong peak from the NZE P-wave. This peak significantly exceeds the ASOD
fluctuations caused by HKE coda wave oscillations. Triggering on trace (2) with
an automatically chosen threshold (equal to twice the root mean square value of the total
trace) allows us to detect the NZE P-wave reliably and to choose the appropriate time interval
(containing the suspected "hidden explosion" signal) as the object for subsequent thorough
analysis (trace (1)).
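
The thresholding rule described above (a threshold equal to twice the RMS value of the total
trace) can be sketched as follows. Only the triggering step is shown; the ASOD statistic itself
is not reproduced, and the function name trigger_intervals and the synthetic detector output are
assumptions made for illustration.

```python
import math


def trigger_intervals(statistic, factor=2.0):
    """Return (threshold, intervals), where intervals are half-open index ranges
    [start, end) in which the statistic exceeds factor * RMS of the whole trace."""
    threshold = factor * math.sqrt(sum(v * v for v in statistic) / len(statistic))
    intervals, start = [], None
    for i, v in enumerate(statistic):
        if v > threshold and start is None:
            start = i                      # trigger switches on
        elif v <= threshold and start is not None:
            intervals.append((start, i))   # trigger switches off
            start = None
    if start is not None:
        intervals.append((start, len(statistic)))
    return threshold, intervals


# Synthetic detector output: unit background with a short strong peak.
trace = [1.0] * 200
for i in range(50, 55):
    trace[i] = 10.0

threshold, intervals = trigger_intervals(trace)
print(round(threshold, 3), intervals)
```

The samples outside the returned intervals are natural candidates for adaptation intervals,
since they are the parts of the record in which no signal was declared.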

Fig.7.1.d illustrates the capability of the adaptive group filtering (AOGF) algorithm to
extract the waveform of a weak seismic phase obscured by coda waves. The HKE coda is
strongly coherent, which results in insufficient suppression by conventional
beamforming in this case (trace (3)). At the same time, it is precisely the strong HKE coda
coherence that allows the AOGF procedure to achieve effective coda wave suppression, as seen in
the AOGF output trace (3). Trace (1) shows the output of the AOGF applied to the "pure" NZE
seismograms (for comparison with trace (3)). One can see that the waveform of the NZE P-wave
is reproduced by the AOGF rather accurately. Note that adaptation of the AOGF algorithm was
made using the NZE+HKE seismogram mixture in the time intervals (0-22) sec and (35-60)
sec., i.e. before and after the interval containing the NZE P-wave. These adaptation intervals
were chosen automatically as a result of the detection procedure described above.

In Fig.7.2.a one can see the output traces of the AOGF procedure applied to the "pure"
NZE seismograms and to the mixture of NZE+HKE seismograms. Superposition of these traces by the
SNDA graphic means (Fig.7.2.b) allows us to assess the distortions of the NZE P-waveform which
are produced by extracting this waveform from the HKE coda waves (note that this was a rather
difficult case where the initial SNR did not exceed 0.5).

Fig.7.2.c illustrates the estimation of the NZE P-wave onset time with the help of the maximum
likelihood algorithm [8,11,12] applied to the AOGF output (trace (3) in Fig.7.1.d). Note
that this algorithm is implemented in the SNDA in two forms: as a stack command (used mainly
for scripts) and as an interactive graphic command.

In Fig.7.2.d the power spectra of the beamforming and AOGF outputs are shown. They are
calculated for the time interval containing the NZE P-wave. The figure demonstrates the small
difference between the spectrum of the extracted NZE P-waveform (after AOGF) and the spectrum of
the initial NZE P-waveform. At the same time, due to the small SNR=0.5, the beam output does not
give any impression of the NZE P-wave spectrum, which makes it impossible to
implement the standard source identification procedures.

Fig. 7.3. Estimation of explosion wave arrival parameters on background of earthquake coda
(SNR=0.5; real parameters are: AZ=32.9°, Vap=10.4 km/sec). a) Wide band F-K map for the
wave mixture. b, c) High resolution F-K map for the wave mixture. Explosion parameter
estimates are: AZ=26.6°, Vap=7.4. d) Adaptive F-K map for the wave mixture. Explosion

Fig.7.3 presents the results of arrival direction estimation for the suspicious wave
detected in the HKE coda. This estimation is a very difficult task and was performed in our
study by means of various F-K analysis algorithms. (Let us emphasize that evaluation of the
seismic wave arrival direction together with estimation of the P and S phase onset times
provides the information needed for locating the event epicenter based on data from a single
SASA.) Fig.7.3.a displays the spatial spectrum calculated by the conventional broad band F-K
analysis algorithm [12,13] using the NZE+HKE seismogram mixture in the time interval containing
the NZE P-wave. One can see that the spectrum maximum, which is the estimate of the wave arrival
direction, differs only slightly from the real arrival direction parameters of the HKE P-wave
(azimuth=102.4° and apparent velocity=14.8 km/sec.). Application of the high resolution F-K
analysis (the modified Capon algorithm with estimation of the array data matrix power spectrum
by multidimensional autoregressive-moving average modeling [12,13]) to the HKE+NZE
seismogram mixture allows us to detect two waves in the time interval being analyzed
(Fig.7.3.b). The global maximum of the F-K map testifies to the presence of the HKE P-wave.
Measuring the location of the second maximum (made with the graphic interactive SNDA tool,
Fig.7.3.c) gives the following estimates for the second wave arrival parameters: azimuth=26.5°,
apparent velocity=7.4 km/sec. These values are rather far from the real NZE P-wave arrival
parameters: azimuth=32.9°, apparent velocity=10.4 km/sec.
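
Since an F-K map is drawn in the slowness (wavenumber) plane, the distance between an estimate
and the true arrival parameters is conveniently expressed as the length of the difference of the
two horizontal slowness vectors. The sketch below does this for the values quoted above; the
function names slowness_vector and slowness_misfit are hypothetical, and the misfit measure
itself is our illustration rather than a quantity used in the report.

```python
import math


def slowness_vector(azimuth_deg, velocity_km_s):
    """(east, north) horizontal slowness in s/km for a given azimuth and apparent velocity."""
    az = math.radians(azimuth_deg)
    s = 1.0 / velocity_km_s
    return s * math.sin(az), s * math.cos(az)


def slowness_misfit(estimate, truth):
    """Length of the slowness difference vector between two (azimuth, velocity) pairs."""
    ex, ey = slowness_vector(*estimate)
    tx, ty = slowness_vector(*truth)
    return math.hypot(ex - tx, ey - ty)


true_nze = (32.9, 10.4)   # real NZE P-wave parameters quoted in the text
high_res = (26.5, 7.4)    # second maximum of the high resolution F-K map

misfit = slowness_misfit(high_res, true_nze)
print(f"{misfit:.4f} s/km")
```

Most of this misfit comes from the underestimated apparent velocity (7.4 instead of 10.4
km/sec), i.e. from an overestimated slowness, rather than from the azimuth error of about 6°.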

Finally, an advanced algorithm was applied for the accurate estimation of the NZE P-wave
arrival direction from the NZE+HKE seismogram mixture. This is the adaptive Maximum
Likelihood (ML) F-K algorithm for direction estimation of a signal plane wave arriving at a SASA
site together with coherent interfering waves. The algorithm is described in Section 3.3. In our
case the HKE coda wave spatial characteristics can be analyzed independently in the time
intervals which do not contain the signal wave, and the adaptation of the ML F-K algorithm was
made using these observations. The employment of the algorithm resulted in the spatial spectrum
estimate shown in Fig.7.3.d. The F-K map contains a single strong peak with the maximum located
at the point with azimuth=31.0° and apparent velocity=8.6 km/sec. These values are closer to
the real arrival parameters of the NZE P-wave than those obtained by the high resolution F-K
analysis. Note also that the high resolution F-K analysis required in our case rather fine
tuning of the program parameters to obtain the impressive map depicted in Fig.7.3.b. At the same
time the adaptive ML F-K algorithm is much more robust in applications to broad band array
seismograms.
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5. FURTHER DEVELOPMENT OF THE SNDA SYSTEM 

Within the framework of this study the problem oriented computer shell
SNDA was significantly advanced to provide high level facilities for scientific
investigations involving the processing of large amounts of experimental data. A high level
object oriented language was developed to support full and complete processing of
the data. Three new original color graphic packages were created and
incorporated into SNDA: "Surfer", for plotting and graphic analysis of different
spatial diagrams and other surfaces; "Map", for depicting and analyzing the mutual
disposition of local seismic network stations and seismic event sources on a
geographic background; and "Cluster", for graphic support of the cluster analysis in
procedures of seismic event discrimination. In addition, the set of Stack commands was
significantly expanded, and some of them were improved on the basis of the SNDA
operation experience accumulated at SYNAPSE Science Center and other institutes
over the last years.

5.1. New high level language 

The SNDA job control language (JCL) was developed to facilitate the
use of SNDA Stack commands and problem oriented data processing
procedures (SA-procedures). JCL is a high level problem oriented object language
(like the well known MATLAB language). The difference is in the nature of the objects.
Here we deal with seismograms and other time series, spectra and matrices, studying
their structures and properties in the course of processing procedures. Being intended
for mass data processing in seismology, the language meets the demands of
conventional structured programming. It has the well known syntax
constructions, which allow one to compose clear and plain (goto-free) programs.
As a result, program robustness, flexibility and ease of modification are
ensured. JCL provides a user with the possibility to perform various graphic
interactive actions during program execution after the commands plot and pause. This
greatly facilitates comprehensive analysis of large volumes of experimental data.

The main JCL constructions are:


block ... endblock
perform
when ... elsewhen ... elsewhen ... else ... endwhen
for ... endfor

The statements block ... endblock together with the operator perform allow one to separate
a complete logical unit into an isolated part of the program.

The construction when ... elsewhen ... elsewhen ... else ... endwhen provides the
choice of one variant from several, depending on a condition.

The construction for ... endfor provides the organization of cycles (loops).

Below is an example of the program written in SNDA JCL. 

int i, j
char tx[10], time[12]
perform initial

for (tx = "NRJ", i=0; i<28; i=i+1)
    for (time = "12.01", j=0; j<3; j=j+1)
        perform processing
        perform savereslt
    endfor
endfor

perform estimate
return # main program

block initial
    # operators
endblock

block processing
    float aaa,bb,cc
    int ee,ff,gg

    readcss30 /detseis/seis/israel/data/exposions/9410031539
    filterC (3-7) 0.1 30 0.12 4
    powspec (3-7) 100 5
    plotspec 5 -AOGF -expl.cmnt
    # other operators
endblock

block savereslt
    savevar plot/kush/parms.bb a c k 1
    savesnd plot/kush/values.bb sndi1 sndi2 sndf10 sndf1 sndc7
endblock

block estimate
    # operators
endblock
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The programs written in the SNDA language, (in contrast to problem oriented 
C or FORTRAN procedures) are referred to as scripts. The structure operators are the
backbone of any script, while the specific processing functions are performed by the
language commands, classified into three groups: SA-procedures, stack commands,
and control statements. Note that even a sophisticated SA-procedure in terms of the
SNDA language appears as a simple JCL command. This is the main peculiarity of
object languages.

The control statements mainly operate with the Black Board Variables (BB-variables).
The latter allow one to control the data analysis by modifying parameters of
procedures (or even whole algorithms) depending on intermediate processing
results. This possibility is realized by using the BB-variables in the
Stack commands (with the prefix "&") instead of numerical or character operands.
The current values of the BB-variables are substituted for the operands (or arbitrary parts of
text) in a command. Thus they provide communication between SA-procedures
during script execution. By making computational cycles and choice operators
dependent on the current values of BB-variables a user can change the sequence of program
processing steps (e.g. execute or skip any SA-procedure or Stack command in
accordance with the results of previous steps).
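
The substitution mechanism can be pictured with the following Python sketch. It is an
illustration of the idea and not the SNDA implementation: the rule that a BB-variable name
consists of letters, digits and underscores, the function name substitute_bb_variables, the
command "somecommand" and the variable names are all assumptions.

```python
import re

# Assumed naming rule: "&" followed by a letter or underscore, then letters, digits, underscores.
BB_REFERENCE = re.compile(r"&([A-Za-z_][A-Za-z0-9_]*)")


def substitute_bb_variables(command, bb_variables):
    """Replace every &name in a command string with the current value of that variable."""
    def replace(match):
        name = match.group(1)
        if name not in bb_variables:
            raise KeyError(f"undefined BB-variable: {name}")
        return str(bb_variables[name])
    return BB_REFERENCE.sub(replace, command)


# Hypothetical command and variable names, used only for illustration.
blackboard = {"nchan": 25, "outdir": "kush"}
expanded = substitute_bb_variables("somecommand &nchan plot/&outdir/values.bb", blackboard)
print(expanded)
```

Because the substitution is purely textual, a BB-variable can stand both for a whole operand
(&nchan) and for a part of a longer text such as a file path (&outdir).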

The JCL control statements serve for the declaration of BB-variables, their
initialization and modification, as well as for changing the order of execution of SNDA
commands depending on the values of BB-variables. The following JCL control
statements currently exist in the SNDA: BBV declaration, BBV assignment, BBV print
and save, statement label, conditional and unconditional goto. In addition, BB-variables
can be assigned interactively and graphically while measuring time
and amplitude trace characteristics in the main graphic window (see 6.6, 6.7 of the
appendix). Unlike the Stack commands and SA-procedures, JCL Control Statements
must have the "." (dot) character at the beginning of the Statement string.

One-dimensional arrays of float, integer and character BB-variables are
supported in the JCL. They are also accessible within an SA-procedure through a special system
call. The following conventional mathematical functions may be used in expressions for
the assignment of BB-variables:

trigonometric: sin, cos, tan, asin, acos, atan;

hyperbolic: sinh, cosh, tanh, asinh, acosh, atanh;

other: log (natural), exp, sqrt.
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Below an example of such expression is shown. 

r = 1- sin ((m [m [a* sqrt((c-b) **2) ] - 6.77] -5.11)/2 *pi/6 ) 

The length of any script Statement must not exceed 125 characters. Statement
operands are to be separated by blanks. Several sequential blanks are treated as
a single blank. Empty lines can be placed among script statements for better script
layout. Any statement starting with the "#" (hash mark) character is considered
a comment and is skipped during script execution.
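
These lexical rules are simple enough to be expressed in a few lines. The Python sketch below
applies them to a piece of script text; it is an illustration and not the SNDA parser. The
function name parse_script is hypothetical, the length limit is assumed to apply to the raw line,
and operands are assumed to contain no blanks (quoted strings with embedded blanks and trailing
"#" comments are not handled).

```python
MAX_STATEMENT_LENGTH = 125


def parse_script(text):
    """Split script text into statements, each a list of blank-separated operands."""
    statements = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_STATEMENT_LENGTH:
            raise ValueError(f"line {number} exceeds {MAX_STATEMENT_LENGTH} characters")
        stripped = line.strip()
        if not stripped:               # empty lines are allowed and ignored
            continue
        if stripped.startswith("#"):   # comment statements are skipped
            continue
        # str.split() with no argument treats any run of blanks as one separator.
        statements.append(stripped.split())
    return statements


script = """
# spectral analysis step
filterC   (3-7)  0.1 30   0.12 4

powspec (3-7) 100 5
"""
parsed = parse_script(script)
print(parsed)
```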

After a script is started, the SNDA performs its preprocessing to examine the
script structure and convert it into a sequence of primitive statements (with goto
operators). As a result a new intermediate script version, composed of primitive operators,
is created and saved in the current directory snda/sun4. The name
of this file ends with .mid. Then SNDA performs the syntax checking of the
commands, the verification of variable specifications and the validation of
their use. All necessary diagnostic messages are displayed in the command
window. If the script text is correct, the direct access file is mapped into computer
memory and executed. This concept provides multiple jumps to labels inside the script
without loss of time.
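
The idea of converting structured constructions into primitive statements with goto operators
can be sketched for the for ... endfor construction as follows. The actual format of the .mid
file is not described here, so the primitive statements emitted below ("label", "ifnot ... goto",
"goto") and the label names are hypothetical, as is the function name lower_for_loops.

```python
import re

# Matches: for (init; condition; step)
FOR_RE = re.compile(r"^for\s*\((.*?);(.*?);(.*?)\)$")


def lower_for_loops(lines):
    """Rewrite for ... endfor constructions into labels and goto statements."""
    out = []
    stack = []     # (top_label, end_label, step) for every loop that is still open
    counter = 0
    for raw in lines:
        line = raw.strip()
        match = FOR_RE.match(line)
        if match:
            init, cond, step = (part.strip() for part in match.groups())
            counter += 1
            top, end = f"L{counter}_top", f"L{counter}_end"
            out.append(init)                               # executed once
            out.append(f"label {top}")
            out.append(f"ifnot ({cond}) goto {end}")       # leave the loop when false
            stack.append((top, end, step))
        elif line == "endfor":
            if not stack:
                raise ValueError("endfor without matching for")
            top, end, step = stack.pop()
            out.append(step)                               # executed after each pass
            out.append(f"goto {top}")
            out.append(f"label {end}")
        elif line:
            out.append(line)
    if stack:
        raise ValueError("for without matching endfor")
    return out


script = [
    "for (i=0; i<2; i=i+1)",
    "    for (j=0; j<3; j=j+1)",
    "        perform processing",
    "    endfor",
    "endfor",
    "perform estimate",
]
lowered = lower_for_loops(script)
for statement in lowered:
    print(statement)
```

After such a conversion every jump target is an explicit label, which is consistent with the
remark above that a script mapped into memory can jump to its labels without loss of time.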

Before being executed, every script statement is displayed with its number on
the console. The statement echo makes it possible to log the desired BB-variables in the
process of script execution.

If the pause (or plot) command is met, the script execution is halted (in the
plot case, after displaying the Stack traces in the SNDA graphic window). To resume
execution a user has to click the GO button in the Stack command window. While
the script is halted a user may perform some interactive actions: measure trace
characteristics in the main graphic window and then modify BB-variables.
This makes it possible to change the course of further script execution.

A detailed description of the language is presented in Section 5 of the
appendix.
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5.2. Surfer 

Surfer is a graphic program package for plotting images of functions of
a two-dimensional variable in the form of various surfaces and maps. The program
creates a special color graphic window for plotting the function images. The latter
may be three- or two-dimensional. The two-dimensional image may be of two types:
in the form of level isolines drawn in different colors (contour image) and in
the form of a topographic map with a gradual transition through a wide spectrum of colors
(map image).

Any plot is created by the executable module "surfer", which reads a control file
(as the well known UNIX packages "contour", "plotxy", and "gnuplot" do). The name
of the control file is the single (and obligatory) argument of the "surfer" command. The
structure of the control file is described below.

After surfer starts, a graphic window with a color 3-dimensional figure
appears. A user may push the Opt button to set some additional parameters of
the image (see below) and then push the 3d button (or the right mouse button) to replot
the image. A user may also rotate the figure around the vertical and horizontal axes in the
range (-90°, +90°) by dragging with the middle mouse button, or turn the image back by
setting an appropriate option. To get a contour image a user has to push the
Contour button, and to get a map image, the Topo button. After plotting the
color image a user may create a PostScript file and view its image on the screen. For
this purpose he has to push the Create PS button (the name of the PostScript file may
also be set in the corresponding panel string); after that a new window is created
with a PostScript (color or monochrome) image. The PostScript file can be printed
using the standard UNIX command.

A user may correct the surfer control and data files on the fly without exiting the
program surfer. He may also change the names of both files and then push the Read
button. The new control file and the corresponding data file will be read. After updating the
data a user may continue the graphic analysis of the different function images.

A user may adjust the color palette of the objects on the screen (3D-figure, contour
or topo maps) according to his own taste. For this purpose he has to click the
Palette button. The changes of palette made by the user remain valid for
the PostScript image too.

A user may exchange the X,Y axes or invert the image along the axis. 

A user may cut out a desired horizontal layer of the 3D image. The clipped 
layer is automatically expanded to the whole range of the Z-axis. This makes it possible 
to investigate a rather thin layer graphically in detail when the 
ordinary image of the whole figure does not display the detailed configuration of the 
layer of interest. This mode is set by the Cut button. 

A user may graphically measure the values of the arguments as well as of the function 
at any point of the surface. Measuring of point coordinates is provided in the Contour 
and Topo modes. This allows, for example, graphic analysis of the 
surface near its different local extrema. The mode for coordinate measuring is set 
by the Measure button. 

In the Contour or Topo mode it is possible to zoom: 
a small piece of the surface may be expanded to the whole image box, and the full color 
palette is then used within the limits of the chosen part of the figure. This makes it possible to achieve 
higher precision of coordinate measurement. The standard X-window resize 
function is implemented, so a user may significantly enlarge the image for 
detailed graphic investigation of a local region of interest. Finally, a user can 
compose a collection of images on one page (in a single 
PostScript file) in order to prepare illustrations for reports or scientific papers. 
A detailed description of Surfer is presented in Section 7 of the Appendix. 

5.3. Map 

The package is intended for interactive graphic analysis of the positions 
of local network stations and possible regional seismic events on a geographic 
background. Two coordinate systems are used simultaneously for the image: 
the geographic one and the plane Gauss-Kruger one. The origin of the latter is assumed to be 
moved from the equator to the geometrical center of the region. Approximating 
polynomials are used to convert geographic coordinates to the Gauss-Kruger system and back. 
The distortion of graphic distance measurements between points on 
the plane depends on the distance of the points from the central meridian of the selected zone. 
The distortions are close to zero when measuring along the meridian and rise to 
hundreds of meters close to the borders of a 6-degree zone. Further expansion of the 
zone causes rather large distortions. 

Nevertheless, the package provides rather exact measurements of the distance 
between two arbitrary points along the ellipsoid arc on the earth's surface (the deviation is 
less than 0.1 m), regardless of the positions and separation of the two points. 

The package provides the following facilities: 

- selection of the needed stations from the total station list using special characters 
masks, applied to the station labels; 

- plotting the selected station network and the selected set of event epicenters on the 
geographic background, and forming a file with the list of selected stations; 

- reducing the scale of the image in order to plot a rather large geographical area 
around the station network and include the event sources of interest; 

- zooming the selected local zone to display a more detailed configuration of the 
station and event groups; 

- interactive graphic measurement of the coordinates of a selected point as well as the 
distance between two arbitrary points using one of two measuring modes: over 
the plane or over the ellipsoid arc; 

- creating a file with the epicentral distances of the network stations relative to a chosen 
event (for further sorting of the multichannel seismogram); 

- drawing different geographic objects on the map: coast lines, state boundaries, 
rivers and so on; 

- creating a PostScript file to be printed on a laser printer. 

A detailed description of the Map package is presented in Section 8 of the 
Appendix. 

5.4. Cluster 

Cluster is a package for graphical support of the source identification 
procedures. It displays the numerical event feature vectors in 3-dimensional 
coordinate space. This enables the user to study interactively the 
discrimination capabilities of the discrimination features used. 

The learning events of every class have to be contained in separate files in 
which every event is represented by one feature vector, forming a row 
of values in integer, float or exponential format. Thus a file may be regarded as a 
matrix of N rows and M columns, where N is the number of events in the class and M 
is the total number of features (the same for all events). The unknown events to be 
attributed to one of the classes must also be in a separate file of the same structure. 
The total number of features must be greater than or equal to 3. 
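
The file layout described above can be checked with a few lines of Python. This is only a minimal sketch: the function name `parse_feature_rows` and the sample values are hypothetical and are not part of the Cluster package; it assumes that the values in a row are separated by blanks.

```python
import numpy as np

def parse_feature_rows(text, min_features=3):
    """Parse rows of blank-separated numbers into an (N, M) array.

    N is the number of events (rows), M the number of features (columns).
    Integer, float and exponential notations are all accepted by float().
    """
    rows = [[float(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no events found")
    m = len(rows[0])
    if any(len(r) != m for r in rows):
        raise ValueError("all events must have the same number of features")
    if m < min_features:
        raise ValueError("at least %d features are required" % min_features)
    return np.array(rows)

# Hypothetical learning class with two events and four features each.
sample_class = parse_feature_rows("1 2.5 3.0e-1 4\n2 1.5 1.0e+0 7\n")
print(sample_class.shape)
```

Reading a real class file would amount to passing the file contents, e.g. `parse_feature_rows(open(name).read())`.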

The System creates a special color graphic window with an image. The plot is 
created by the executable module cluster, which reads a control file. The name of the 
control file is the single (and obligatory) argument of the cluster command. 

After cluster starts, a color graphic window appears showing a 3-dimensional box 
with color point clusters. Each event is depicted as a point whose symbol and color 
correspond to its class. A user may push the List button to view the list 
of events. He may then rotate the box by moving the middle mouse button to achieve 
maximum separation of the event clusters for the known classes. The position of the tested 
event point in this plot enables the user to classify the event, i.e. to attribute it to one of the 
learning clusters. 

There exist two alternative ways to initiate the cluster program. 

1) A user defines in the control file the three desired components of the feature vectors 
as the first operands of the corresponding operators: xlabel, ylabel, zlabel. Here he also defines 
the linear or logarithmic scale mode for each axis and the name of the feature used. 
The cluster image appears immediately after starting. 

2) A user does not define in the control file the specific numbers of the desired 
features. The program starts with an empty graphic window without any image. The 
selection of the desired features is performed interactively in the special “Select 
features” window. To do this a user must push the Select button of the “cluster” 
main menu, after which a new window appears with the full list of feature numbers 
positioned in small boxes. The user should select the three desired numbers with 
the left mouse button (the selected boxes are highlighted in red), set an 
appropriate scale mode by clicking the corresponding choice buttons, and push the plot 
button or the right mouse button. The cluster image will appear. In this interactive way a 
user may easily study many combinations of features to achieve maximum 
separation of the event point clusters for different classes. 

The cluster program distinguishes the two ways by means of the labfile operator. If 
the labfile operator is present in the control file, the second way 
of starting is assumed. If the labfile operator is omitted (but xlabel, ylabel, zlabel are given), the 
program chooses the first way. The standard X-window resize function is implemented in 
the program, which allows the size of the “cluster” window to be adjusted as convenient. 

After plotting the image on the screen a user can compose and 
preview a PostScript file of the image. This is done by pushing the Form PS-file button 
(the name of the file may also be set in the corresponding panel string); a new 
window is then created with the PostScript image. The PostScript file with the assigned name 
can then be printed using the standard UNIX tool. As in the “Surfer” package, a user 
can compose a collection of images on one page. 

A detailed description of the Cluster package is presented in Section 9 of the 
Appendix. 

5.5. Other developments 

5.5.1. The most important Stack trace processing commands are now included in 
the interactive graphic toolkit, in the main graphic menu of the multichannel graphic 
frame. The new button GP (Graphic processing) opens the following processing 
procedures: onset time estimation, direct fast Fourier transform, power and cross 
spectra estimation, and filtering by different methods (see Section 6.13 of the 
Appendix). All Stack commands with these names are also retained. 

5.5.2. A more convenient way to select channels and a time window has been 
developed in addition to the old one: a user may select channels and simultaneously 
set a new window in “TW-mode” by drawing a rectangle in the graphic window with 
the middle mouse button (see Section 6.5 of the Appendix). In addition, the Magnify 
mode is significantly improved (see Section 6.9 of the Appendix). 

5.5.3. Some new stack commands are added (see Section 4 of the Appendix): 

read/save commands: savestack, readstack, readtab, readimpk, readdem; 

window handle commands: winl, winmax; 

trace handle commands: stcopy; 

trace arithmetic commands: mult, addc; 

trace processing commands: abs, receipr, sqr, msqr, sqrt, msqrt, power, ln, 
log, mean, meansqr, meanabs, tmax, tmin; geos (composing a new trace with cosine 
signal); 

network trace sorting commands: sort, synchro, episort, episort1, episort2; 

displaying and plot commands: flist, surfer, map, cluster. 

5.5.4. Some stack commands are improved, namely: 
plotspec - a frequency interval may be set and comments are provided; 
savetab, savepack, savefloat now operate on the current window. 

5.5.5. The channels operand for stack commands now has a freer form: 
it can consist of channel numbers or intervals of numbers, separated by blanks or 
commas, with the whole string enclosed in parentheses (see Section 4.1 of the 
Appendix), for example: 

(1,3 5 10-13 20-23 25) 
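
For illustration, the following Python sketch expands such an operand into an explicit list of channel numbers. The function `parse_channels` is hypothetical and is not part of the SNDA System; it only assumes the syntax described above (numbers or `first-last` intervals, separated by blanks or commas, inside parentheses).

```python
import re

def parse_channels(operand):
    """Expand an operand such as "(1,3 5 10-13)" into a list of channel numbers."""
    text = operand.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("operand must be enclosed in parentheses")
    channels = []
    # Items are separated by any mix of commas and blanks.
    for token in re.split(r"[,\s]+", text[1:-1].strip()):
        if not token:
            continue
        if "-" in token:
            first, last = (int(x) for x in token.split("-"))
            channels.extend(range(first, last + 1))  # interval is inclusive
        else:
            channels.append(int(token))
    return channels

print(parse_channels("(1,3 5 10-13 20-23 25)"))
# prints [1, 3, 5, 10, 11, 12, 13, 20, 21, 22, 23, 25]
```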

5.5.6. The installation of the whole SNDA tree on a user's computer is simplified 
and is performed by just one command "tar -xvf\ 

6. DESCRIPTION OF PROGRAMS FOR 3-COMPONENT ARRAY DATA PROCESSING 

6.1. Program “MODELS” 

Modeling of 3-component array seismograms 

The program simulates multichannel seismograms at the output of a 3-component 
(3C) array. First, let us emphasize that the program does not calculate a 
synthetic seismogram for a given medium model and seismic source location. 
It provides a much simpler imitation of a multichannel seismogram, as needed for 
testing algorithms and programs intended for advanced 3C array data processing. 

At the first step a seismic phase signal is generated with an assigned deterministic 
waveform or a random waveform with a given power spectral density. The signal 
corresponds to the recording of a seismic wave phase by a 1C sensor located at the 
point with coordinates (x,y,z)=(0,0,0) and oriented in the direction of the seismic 
phase oscillations. Then, using this basic waveform, the program calculates the phase 
waveforms for every sensor of a 3C array, taking into account the array geometry, 
the arrival direction of the plane seismic wave and the phase type: P, SH, SV, L & R. 

Let the X-axis be directed to the West, the Y-axis to the North and the Z-axis along the 
upward perpendicular to the day surface. Assume that the registered wave field is generated 
by a distant seismic source, so that every seismic wave phase has a plane wave 
front. Suppose that a body wave phase arrives at the day surface from a laterally 
homogeneous lower half-space, and denote by $a = (a_x, a_y, a_z)^T$ the unit directional 
vector of the wave front, by $\alpha$ and $\beta$ the back azimuth and incidence angle of the 
arrival direction, and by $V$ the wave velocity just beneath the day surface. Using this 
notation we have: 

$$a = (-\sin\alpha \sin\beta,\; -\cos\alpha \sin\beta,\; \cos\beta)^T.$$

The displacement of the medium particles $u(t,r) = (u_x(t,r), u_y(t,r), u_z(t,r))^T$ at a point 
$r = (r_x, r_y, r_z)^T$ for a given wave phase can be written as 

$$u(t,r) = s\left(t - (r^T a)/V\right) b,$$

where $s(t)$ is the scalar phase waveform registered at the coordinate origin and 
$b = (b_x, b_y, b_z)^T$ is the unit vector describing the oscillation direction (polarization) 
of the given wave phase. 

In the frequency domain this equation has the form: 

$$u(f,r) = s(f) \exp\left(-i 2\pi f (r^T a)/V\right) b.$$
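
The delay term $(r^T a)/V$ and the frequency-domain phase shift can be illustrated with the short NumPy sketch below. It is not the code of the MODELS program: the function names are hypothetical, and the shift is applied with a discrete FFT, so it is circular (the waveform is assumed to be zero-padded sufficiently).

```python
import numpy as np

def direction_vector(back_azimuth_deg, incidence_deg):
    """Unit vector a = (-sin(al)sin(be), -cos(al)sin(be), cos(be))."""
    al = np.radians(back_azimuth_deg)
    be = np.radians(incidence_deg)
    return np.array([-np.sin(al) * np.sin(be),
                     -np.cos(al) * np.sin(be),
                     np.cos(be)])

def sensor_delays(coords_km, back_azimuth_deg, incidence_deg, velocity_km_s):
    """Delay (r^T a)/V in seconds for every sensor position r (rows of coords_km)."""
    a = direction_vector(back_azimuth_deg, incidence_deg)
    return np.asarray(coords_km, dtype=float) @ a / velocity_km_s

def shift_waveform(s, delay_s, dt):
    """Delay s(t) by multiplying its spectrum with exp(-i 2 pi f delay)."""
    f = np.fft.rfftfreq(len(s), d=dt)
    spectrum = np.fft.rfft(s) * np.exp(-2j * np.pi * f * delay_s)
    return np.fft.irfft(spectrum, n=len(s))

# Hypothetical example: two sensors, the second 1 km along the Y-axis.
delays = sensor_delays([[0, 0, 0], [0, 1, 0]], 0.0, 30.0, 6.0)
pulse = np.zeros(64)
pulse[10] = 1.0
shifted = shift_waveform(pulse, 0.1, 0.05)   # 0.1 s = 2 samples at dt = 0.05 s
```

Multiplying each shifted waveform by the polarization vector b of the chosen phase then gives the three component traces of a sensor.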
For the simplest medium model, which ignores the effect of the day surface 
on the wave field, the vector b is given by the following simple geometric 
expressions: 

for the P-wave: 

$$b_P = (-\sin\alpha \sin\beta,\; -\cos\alpha \sin\beta,\; \cos\beta)^T;$$

for SH and Love waves: 

$$b_L = (\cos\alpha,\; -\sin\alpha,\; 0)^T;$$

for the SV-wave: 

$$b_V = (\sin\alpha \cos\beta,\; \cos\alpha \cos\beta,\; \sin\beta)^T;$$

for the Rayleigh wave: 

$$b_R = (-i \sin\alpha \sin\psi,\; -i \cos\alpha \sin\psi,\; \cos\psi)^T,$$

where $\psi = \mathrm{arctg}(e)$, $e$ is the Rayleigh wave elliptic coefficient (the ratio of the small and large 
axes of the wave polarization ellipse), and $i = \sqrt{-1}$ characterizes the $\pi/2$ phase shift 
between the horizontal and vertical components of the Rayleigh phase displacements. 
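
As a cross-check of the geometric expressions above, the following sketch evaluates the four polarization vectors and verifies that the P, SH and SV vectors form an orthonormal triple. The function name and the phase labels are hypothetical and are chosen only for this illustration.

```python
import numpy as np

def polarization_simple(phase, back_azimuth_deg, incidence_deg=0.0, ellipticity=0.7):
    """Polarization vector b for the simplest medium model (no free-surface effect)."""
    al = np.radians(back_azimuth_deg)
    be = np.radians(incidence_deg)
    sa, ca, sb, cb = np.sin(al), np.cos(al), np.sin(be), np.cos(be)
    if phase == "P":
        return np.array([-sa * sb, -ca * sb, cb], dtype=complex)
    if phase in ("SH", "LOVE"):
        return np.array([ca, -sa, 0.0], dtype=complex)
    if phase == "SV":
        return np.array([sa * cb, ca * cb, sb], dtype=complex)
    if phase == "RAYLEIGH":
        psi = np.arctan(ellipticity)   # psi = arctg(e)
        return np.array([-1j * sa * np.sin(psi), -1j * ca * np.sin(psi), np.cos(psi)])
    raise ValueError("unknown phase: %s" % phase)

b_p = polarization_simple("P", 10.0, 20.0)
b_sh = polarization_simple("SH", 10.0, 20.0)
b_sv = polarization_simple("SV", 10.0, 20.0)
b_r = polarization_simple("RAYLEIGH", 10.0, ellipticity=0.7)
# Rows are mutually orthogonal unit vectors, so the Gram matrix is the identity.
gram = np.array([[np.vdot(x, y) for y in (b_p, b_sh, b_sv)] for x in (b_p, b_sh, b_sv)])
```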

The model above is the simplest in the sense that it does not take into account 
the reflections and conversions of the different wave phases arriving at the 
day surface. A more complex but more realistic model of seismic wave propagation 
in the vicinity of the day surface was proposed by D.Kennet. From this model one obtains 
the following equations for the vector b: 

for the P-wave: 

$$b_P = (-\sin\alpha \cdot V_P p_h C_2,\; -\cos\alpha \cdot V_P p_h C_2,\; V_P q_P C_1)^T;$$

for SH and Love waves: 

$$b_L = (2\cos\alpha,\; -2\sin\alpha,\; 0)^T;$$

for the SV-wave: 

$$b_V = (\sin\alpha \cdot V_S q_S C_1,\; \cos\alpha \cdot V_S q_S C_1,\; V_S p_h C_2)^T;$$

for the Rayleigh wave: 

$$b_R = (-i \sin\alpha \sin\psi,\; -i \cos\alpha \sin\psi,\; \cos\psi)^T,$$

where $V_P$, $V_S$ are the phase velocities of the P and S waves, respectively; $p_h$ is the 
horizontal apparent slowness of the wave phase; 

$$q_P = (V_P^{-2} - p_h^2)^{1/2}; \quad q_S = (V_S^{-2} - p_h^2)^{1/2};$$

$$C_1 = \frac{2 V_S^{-2} (V_S^{-2} - 2 p_h^2)}{(V_S^{-2} - 2 p_h^2)^2 + 4 p_h^2 q_P q_S}; \quad C_2 = \frac{4 V_S^{-2} q_P q_S}{(V_S^{-2} - 2 p_h^2)^2 + 4 p_h^2 q_P q_S}.$$
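
The free-surface coefficients can be evaluated numerically as in the sketch below. It is an illustration only, not the MODELS code: it assumes that the velocities enter the formulas as inverse squares (slowness squared), that the SV horizontal components have the same sign as in the simplest model, and that the horizontal slowness is below $1/V_P$ so that $q_P$ and $q_S$ are real. The function names are hypothetical.

```python
import numpy as np

def kennett_coefficients(vp, vs, ph):
    """Return q_P, q_S, C_1, C_2 for velocities vp, vs and horizontal slowness ph."""
    vp, vs, ph = float(vp), float(vs), float(ph)
    qp = np.sqrt(vp ** -2 - ph ** 2)
    qs = np.sqrt(vs ** -2 - ph ** 2)
    denom = (vs ** -2 - 2 * ph ** 2) ** 2 + 4 * ph ** 2 * qp * qs
    c1 = 2 * vs ** -2 * (vs ** -2 - 2 * ph ** 2) / denom
    c2 = 4 * vs ** -2 * qp * qs / denom
    return qp, qs, c1, c2

def polarization_kennett(phase, back_azimuth_deg, vp, vs, ph):
    """Vector b for P, SH/Love and SV phases with the free-surface effect included."""
    al = np.radians(back_azimuth_deg)
    sa, ca = np.sin(al), np.cos(al)
    qp, qs, c1, c2 = kennett_coefficients(vp, vs, ph)
    if phase == "P":
        return np.array([-sa * vp * ph * c2, -ca * vp * ph * c2, vp * qp * c1])
    if phase in ("SH", "LOVE"):
        return np.array([2 * ca, -2 * sa, 0.0])
    if phase == "SV":
        return np.array([sa * vs * qs * c1, ca * vs * qs * c1, vs * ph * c2])
    raise ValueError("unknown phase: %s" % phase)

# Vertical incidence (ph = 0): the free surface doubles the amplitude.
b_p_vertical = polarization_kennett("P", 30.0, 6.0, 3.5, 0.0)
b_sv_vertical = polarization_kennett("SV", 90.0, 6.0, 3.5, 0.0)
```

At vertical incidence the sketch reproduces the factor 2 that also appears in the SH expression, which is a useful sanity check on the reconstructed coefficients.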

Besides modeling a multichannel seismogram at the output of a 3C array, the 
program can simulate the output of the so-called Strain-Inertial Micro Array (SIMA), 
a seismic field recording installation consisting of a single 3C seismometer and two 
horizontal strainmeters (directed E-W and N-S) deployed at the same site. The 
known relation between the recordings $d_i(t)$ and $u_i(t)$, $i = x, y$, of the similarly directed 
horizontal sensors of the seismometer and strainmeter, $d_i(t) = p_i u_i(t)$, where $p_i$ is the 
horizontal apparent velocity in the i-th direction, is used for this simulation. 


Input parameters of the program 

All the program parameters are to be contained in a disk file with the name 
“models.inp”. An example of the file is given below. 

*** FILE OF INPUT PARAMETERS FOR PROGRAM "MODELS": standard *** 
INSTALLATION TYPE: 3C ARRAY=1; 1C ARRAY=2; 2C HORIZ. ARRAY=3; SIMA=4 
1 

NUMBER OF SEISMOMETERS 

12 

NAME OF FILE WITH SEISMOMETER COORDINATES 
alibek.crd 

TYPE OF WAVE PHASE: P = 1; SH or LOVE = 2; SV = 3; RAYLEIGH = 4 
1 

RAYLEIGH WAVE ELLIPTIC COEFFICIENT 
0.7 

P- AND S-WAVE MEDIUM PHASE VELOCITIES 
6. 3.5 

NAME OF FILE WITH DISPERSION CURVE (IF NAME=' ' - WITHOUT DISPERSION) 
disp.dat 

MEDIUM MODEL: SIMPLEST = -1; KENNET MODEL = 1 
-1 

WAVEFORM TYPE: WHITE UNCORRELATED CHANNEL SIGNALS = -2; WHITE NOISE 
WAVEFORM = -1; BERLAGUE PULSE = 0; AR-PROCESS = 1; PATTERN WAVEFORM = 2; 
1 

10. CENTRAL FREQUENCY FOR BERLAGUE PULSE (HZ) 

0.2 

AR-MODEL ORDER 
4 

INPUT FILE FOR WAVEFORM DESIGN (FOR MODEL TYPE 1 OR 2): '*.dat' - TIME 
SERIES; '*.arc' - AR-COEFFICIENTS 
pwform.dat 

PARAMETERS C & B OF WAVEFORM ENVELOPE: ENV(T)=C*T*EXP(-B*T) (IF C=0 - 
ENV=1) 
10. 0.002 

ARRIVAL AZIMUTH (DEGREES) 
10. 

ANGLE OF INCIDENCE (DEGREES) 
20. 

ONSET TIME OF WAVE PHASE (SEC) 
20. 

LENGTH OF WAVE PHASE (SEC) 
80. 

OUTPUT DATA SAMPLING INTERVAL (SEC) 
0.05 

POWER OF OUTPUT WAVEFORM 
500. 

WRITE MODE: TO DISK FILE = -1; TO SYSTEM STACK = 1 
1 

NAME OF FILE FOR OUTPUT DATA PARAMETERS (FOR WRITE MODE = -1) 
models.par 

NAME OF FILE FOR OUTPUT DATA SAMPLES (FOR WRITE MODE = -1) 
models.dat 

INITIAL VALUE FOR RANDOM NUMBER GENERATOR 
11 

Explanation of the input parameters 

1. ARRAY TYPE: 3C ARRAY = 1; 1C ARRAY = 2; 2C HORIZ. ARRAY = 3; 
SIMA = 4 

The parameter defines the type of recording installation and can have the values: 

1 - for a 3-component array or a single 3-component station (if the parameter NUMBER 
OF SEISMOMETERS equals 1); 

2 - for a 1-component array consisting of similarly oriented sensors; 

3 - for a 2-component horizontal array; 

4 - for a strain-seismometer microarray (SIMA), consisting of a single 3C seismometer 
and two horizontal strainmeters located at the same site. 

2. NUMBER OF SEISMOMETERS 

The number of 1, 2 or 3-component seismic receivers composing the array; 

3. NAME OF FILE WITH SEISMOMETER COORDINATES 

The name of the file with the coordinates (in km) of the seismic receivers composing the array 
(1C, 2C, 3C seismometers or SIMA). If the number of seismometers is equal to 1, it 
is assumed to be located at the point with coordinates (0,0). The file has to consist of 
two ASCII columns: the first with X-coordinates (East-West orientation) and the second 
with Y-coordinates (South-North orientation). In this version of the program the 
Z coordinates of the seismometers are assumed to be equal to 0. 

4. TYPE OF WAVE PHASE: P = 1; SH or LOVE = 2; SV = 3; RAYLEIGH = 4 

The parameter determines the type of seismic phase to be simulated: (1) for a P-wave, 
(2) for an SH or Love wave, (3) for an SV wave and (4) for a Rayleigh wave. The given 
version of the program generates array recordings for a single wave phase; 

5. RAYLEIGH WAVE ELLIPTIC COEFFICIENT 

This is the elliptic coefficient of the Rayleigh wave: the ratio of the small axis of the 
oscillation polarization ellipse to the large one; 

6. P- AND S-WAVE MEDIUM PHASE VELOCITIES 

These are the phase velocities of P and S waves in the medium just beneath the array. 
When modeling Rayleigh (Love) waves, if the parameter “Name of file with 
dispersion curve” consists of 5 blanks, then the P-wave velocity (i.e. the first of the two 
parameters assigned here) is used as the surface wave phase velocity. 

7. NAME OF FILE WITH DISPERSION CURVE (IF NAME=' ' - WITHOUT 
DISPERSION) 

The name of the file containing the surface wave dispersion curve. If the 
name contains blanks in the first 5 positions, the dispersion is absent and 
the surface wave is modeled using a frequency-independent phase velocity. 
The file has to consist of two ASCII columns: the frequency in Hz and the 
velocity in km/s; 

8. MEDIUM MODEL: SIMPLEST = -1; KENNET MODEL = 1 

“Medium model” = -1 is used for the simple laterally homogeneous model, 
which does not account for the effect of the day surface on the wave arrival parameters; 
“medium model” = 1 is used for the Kennet model, which accounts for the 
frequency-independent wave reflections and conversions at the day surface. 

9. WAVEFORM TYPE: WHITE UNCORRELATED CHANNEL SIGNALS = -2; 
WHITE NOISE WAVEFORM = -1; BERLAGUE PULSE = 0; AR-PROCESS = 1; 
PATTERN WAVEFORM = 2; 

The parameter determines the type of phase waveform: 

(-2) - in this case all channel waveforms are modeled as realizations of independent 
white Gaussian random processes with zero mean and equal dispersions (defined by 
the value of the parameter “Power of output waveform”); 

(-1) - in this case the waveform of the wave phase oscillations is assumed to be a white 
Gaussian time series; the channel waveforms are generated from it using the array 
configuration, wave type, wave arrival direction and wave velocity in the medium; 

(0) - in this case the phase waveform has the shape of the Berlage pulse: 
$u(t) = t^a \exp(-2\pi f_0 b t) \sin(2\pi f_0 t)$, where $f_0$ is the central frequency of the waveform 
spectrum and $a$, $b$ are the parameters of the pulse shape; 

(1) - in this case the phase waveform is modeled as an autoregressive random process 
(with an assigned power spectrum) using the equation 

$$u(t) = \sum_{k=1}^{p} a(k) u(t-k) + \sigma e(t),$$

where $a(k)$, $k = 1, \ldots, p$ are the autoregression coefficients, $p$ is the autoregression order, 
$e(t)$ is the innovation white random time series with zero mean and unit dispersion, and 
$\sigma^2$ is the dispersion of the autoregression residuals; 

(2) - in this case a pattern time series stored in a special disk file is used for 
simulation of the phase waveform to be modeled. 

10. CENTRAL FREQUENCY FOR BERLAGUE PULSE (HZ) 

If the phase waveform is modeled as the Berlage pulse the parameter defines the pulse 
central frequency. 

11. AR-MODEL ORDER 

If the phase waveform is modeled as autoregression (AR) process this parameter 
defines the order of autoregression. 

12. INPUT FILE FOR WAVEFORM DESIGN (FOR MODEL TYPE 1 OR 2): 
'*.dat' - TIME SERIES; '*.arc' - AR-COEFFICIENTS 

The parameter defines the name of the file used when the MODEL TYPE value is equal to 1 or 
2. In the first case the file has to have the extension “.dat” or “.arc”. A file with 
the extension “.dat” has to contain a pattern waveform (time series), which is 
approximated by an AR process of the given order p. The autoregressive coefficients 
calculated as a result of this approximation are then used for simulating the phase 
waveform. This makes it possible to generate a great number of statistically 
independent seismograms with a given power spectrum, which is necessary for Monte 
Carlo investigation of the characteristics of array data processing algorithms. 

A file with the extension “.arc” has to contain the AR coefficients for generating the 
phase waveform to be modeled. 

When WAVEFORM TYPE = 2 the file can have an arbitrary name and has to 
contain the pattern phase waveform (usually taken from a real seismic recording) 
which is used for generating the multichannel array seismogram. 

In any case the file has to contain one ASCII column: N waveform 
samples or p AR coefficients. 

13. PARAMETERS C & B OF WAVEFORM ENVELOPE: ENV(T)=C*T*EXP(-B*T) 
(IF C=0 => ENV=1) 

These are the parameters of the phase waveform envelope. For all values of the parameter 
MODEL TYPE except 0 (Berlage pulse) and 2 (pattern time series) the generated waveform 
is multiplied by the function x(t)=C·t·exp(-Bt), which determines the 
waveform envelope. If C=0 then x(t)=1. 

For MODEL TYPE=0 these parameters define the coefficients of the Berlage pulse; for 
MODEL TYPE=2 the parameters are not used. 

14. ARRIVAL AZIMUTH (DEGREES) 

This parameter defines the arrival back azimuth of modeled wave to the receiving 
installation (for direction from the receiver to a seismic source). The azimuth is 
counted clockwise from the Y (South-North) axis. 

15. ANGLE OF INCIDENCE (DEGREES) 

This is the incidence angle of modeled wave arrival to the receiving installation. The 
angle is counted from the vertical to the day surface. 

16. ONSET TIME OF WAVE PHASE (SEC) 

This is the onset time of the wave phase in the point with the coordinates (0,0). It is 
assumed that the first sample of the seismogram to be modeled corresponds to the 
time moment equal to zero. From this moment to the phase onset time the modeled 
seismogram has the zero values. 

17. LENGTH OF WAVE PHASE (SEC) 

This is the length (in sec.) of the wave phase to be modeled. This parameter jointly 
with the phase onset time defines the total length of the simulated seismogram. 

18. OUTPUT DATA SAMPLING INTERVAL (SEC) 

The parameter determines the sampling interval of the seismogram being simulated. 

19. POWER OF OUTPUT WAVEFORM 

The parameter determines the averaged power of the seismogram being simulated. 

20. WRITE MODE: TO DISK FILE = -1; TO SYSTEM STACK = 1 

The parameter determines the device for saving the simulated multichannel seismogram: 
(-1) for a disk file, (1) for the SNDA System stack. 

21. NAME OF FILE FOR OUTPUT DATA PARAMETERS (FOR WRITE MODE = -1) 

If the simulated multichannel seismogram has to be written to a disk file, this 
parameter determines the name of the file for saving the seismogram parameters: number 
of channels, number of samples in each channel, sampling interval (in sec). These 
values are saved as a single row in ASCII format. 

22. NAME OF OUTPUT DATA FILE (FOR WRITE MODE = -1) 

If the simulated multichannel seismogram has to be written to a disk file, this 
parameter determines the name of the file for saving the seismogram sample data. The 
data are written in ASCII format in multiplexed form without any header. 

23. INITIAL VALUE FOR RANDOM NUMBER GENERATOR (MODEL 
TYPE=(-2), (-1), (1)) 

The parameter determines the initial (integer) value for starting the random number 
generator used for modeling stochastic white time series. 
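
The waveform types 0 and 1 and the envelope ENV(T)=C*T*EXP(-B*T) described in the parameter list above can be sketched in NumPy as follows. This is a hedged illustration, not the MODELS implementation: the function names are hypothetical, the innovation is scaled by the standard deviation `sigma` (an assumption), the AR recursion starts from zero initial conditions, and NumPy's own random generator stands in for whatever generator the program uses.

```python
import numpy as np

def berlage_pulse(t, f0, a, b):
    """Berlage pulse u(t) = t^a exp(-2 pi f0 b t) sin(2 pi f0 t)."""
    t = np.asarray(t, dtype=float)
    return t ** a * np.exp(-2 * np.pi * f0 * b * t) * np.sin(2 * np.pi * f0 * t)

def ar_process(coeffs, n, sigma=1.0, seed=11):
    """Generate u(t) = sum_k a(k) u(t-k) + sigma e(t), with e(t) unit white noise."""
    rng = np.random.default_rng(seed)      # seed plays the role of the initial value
    coeffs = np.asarray(coeffs, dtype=float)
    p = len(coeffs)
    e = rng.standard_normal(n)
    u = np.zeros(n + p)                    # p leading zeros are the initial conditions
    for t in range(n):
        past = u[t:t + p][::-1]            # u(t-1), ..., u(t-p)
        u[t + p] = coeffs @ past + sigma * e[t]
    return u[p:]

def envelope(t, c, b):
    """Envelope x(t) = C t exp(-B t); x(t) = 1 when C = 0."""
    t = np.asarray(t, dtype=float)
    if c == 0:
        return np.ones_like(t)
    return c * t * np.exp(-b * t)

# Hypothetical usage: an AR(2) waveform of 80 s at 0.05 s sampling, with envelope.
time = np.arange(0.0, 80.0, 0.05)
waveform = ar_process([0.5, -0.2], len(time)) * envelope(time, 10.0, 0.002)
```

Different seeds give statistically independent realizations with the same AR coefficients, which is the property used for Monte Carlo studies.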

Output data format 

The program output consists of an N-channel array seismogram. The number of 
channels depends on the chosen array configuration. The general ordering of the output 
traces for every recording installation (1C, 2C, 3C seismometers or SIMA) composing 
the array is the following: 1) Strainmeter S-N, 2) Strainmeter W-E, 3) Seismometer 
S-N, 4) Seismometer W-E, 5) Seismometer Z. Some of the above traces can be 
absent from the simulated multichannel seismogram, depending on the program input 
parameters for the array configuration. The ordering of the 1-, 2- or 3-component 
seismometer outputs is determined by the file of array seismometer coordinates. If the 
simulated seismogram is written to the SNDA Stack (parameter WRITE MODE=1), 
all traces are placed at the beginning of the stack. 
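
When the output is written to disk, the two files can be read back roughly as follows. The sketch assumes that "multiplexed" means that the samples of all channels for one time instant are stored consecutively (channel index varying fastest); this interpretation, like the function name `read_multiplexed`, is an assumption and is not confirmed by the description above.

```python
import numpy as np

def read_multiplexed(par_text, dat_text):
    """Return (traces, dt); traces has shape (n_channels, n_samples).

    par_text: contents of the parameter file (channels, samples, interval).
    dat_text: contents of the headerless ASCII data file.
    """
    fields = par_text.split()
    n_channels, n_samples, dt = int(fields[0]), int(fields[1]), float(fields[2])
    samples = np.array([float(x) for x in dat_text.split()])
    if samples.size != n_channels * n_samples:
        raise ValueError("data file size does not match the parameter file")
    # Assumed layout: ch1(t0) ch2(t0) ... chN(t0) ch1(t1) ...
    traces = samples.reshape(n_samples, n_channels).T
    return traces, dt

# Hypothetical two-channel, three-sample example.
traces, dt = read_multiplexed("2 3 0.05", "1 10 2 20 3 30")
```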

6.2. Program “POL” 

Polarization filtering of data from single 3-component station by Flinn method 

The program implements the classic Flinn algorithm of polarization filtering of 3-component 
(3C) data. Its basic idea is to use the linear polarization of 
P, SH and SV waves propagating through a layered laterally homogeneous medium, 
and the possibility of transforming elliptically polarized Rayleigh-type wave recordings into 
quasi-linearly polarized 3C time series. 

A moving time window consisting of N data samples is chosen inside the time 
interval under investigation. The length of the window has to meet the following two 
restrictions: 1) The number of samples N should be sufficiently large to provide 
statistically reliable suppression of the seismic noise interfering with the signal waveform. 
2) The window length should not exceed the duration of the seismic wave phases with 
different polarization and the intervals between these phases: if two such phases are 
contained in the same time window, it is not possible to distinguish these waves with 
any polarization filter. Thus, for successful polarization filtering a compromise has to 
be reached between the resolving power and the reliability of extraction of waves with 
different polarization. 

The 3x3 covariance matrix C of the 3-component data within the chosen time 
window is estimated and its eigenvalues $\lambda_j$ and eigenvectors $e_j$ are 
calculated. Using these values, weight functions F and D characterizing the linearity 
and direction of the wave phase polarization are evaluated. The function F is the 
same for longitudinal and transverse waves and equals $F = (1 - (\lambda_2/\lambda_1)^m)^n$, where 
$\lambda_1$ and $\lambda_2$ are the largest and the second largest eigenvalues of the covariance 
matrix C. The function D is calculated for each type of wave phase (P, SH 
and SV) using different formulas: 

$$D_P = (e_1^T U_P)^l; \quad D_{SH} = (e_1^T U_{SH})^l; \quad D_{SV} = (e_1^T U_{SV})^l.$$

Here $e_1$ is the eigenvector corresponding to the largest eigenvalue $\lambda_1$; $U_P$ is a unit 
column vector normal to the assumed position of the phase wave front; $U_{SH}$, $U_{SV}$ are 
unit column vectors orthogonal to $U_P$ and lying in the horizontal and vertical 
planes, respectively; the exponents n, m and l determine the order of 
nonlinearity of the polarization filter and are chosen empirically. 

The weight function F is equal to 1 for a purely linearly polarized wave 
phase and decreases to zero as the ellipticity of the wave polarization increases up 
to a purely spherical one. The function $D_P$ emphasizes phase oscillations in the current 
time window if the direction of the eigenvector $e_1$ is close to the assumed phase arrival 
direction; the functions $D_{SH}$ and $D_{SV}$ do so if $e_1$ is orthogonal to this direction. 

The following three values are calculated for every current position t of the 
time window middle point: 

$$P_t = F_t D_{P,t} (X_t^T U_P); \quad SH_t = F_t D_{SH,t} (X_t^T U_{SH}); \quad SV_t = F_t D_{SV,t} (X_t^T U_{SV}),$$

where $X_t$ is the vector of the 3C data samples at time t. 

A user can perform polarization filtering under two assumptions: the location of the 
seismic source is known or unknown. In the first case it is possible to calculate the 
vectors $U_P$, $U_{SH}$, $U_{SV}$ and then to perform the filtering using the given vector values. 
In the second case the vector $U_P$ has to be estimated from the observations. As a rule, the 
eigenvector $e_1$ corresponding to the maximum of the weight 
function F is used as such an estimate. This is justified by the fact that the maximum 
polarization linearity of the 3C observations coincides with the arrival of the P-phase. 
However, for a small signal-to-noise ratio the maximal F value can be connected with an 
S-phase arrival. This leads to a mistake in the identification of the P and SV phases. So for 
weak events one should use information about the P-phase onset time and seek the 
maximum of the weight function F just after this moment. The estimated value of $U_P$ 
allows one to calculate $U_{SH}$, $U_{SV}$ and then perform the polarization filtering. 
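
The filtering scheme described above (moving window, covariance matrix, eigen-decomposition, weights F and D) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the POL program itself: the function names are hypothetical, the absolute value of the projection is used in D to remove the arbitrary sign of the eigenvector, and windows with zero energy are given zero weight.

```python
import numpy as np

def flinn_weights(window, u_p, u_sh, u_sv, m=1, n=2, l=2):
    """Return (F, D_P, D_SH, D_SV) for an (N, 3) window of 3C samples."""
    c = np.cov(window, rowvar=False)          # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(c)      # eigenvalues in ascending order
    lam1, lam2 = eigvals[2], max(eigvals[1], 0.0)
    if lam1 <= 0.0:
        return 0.0, 0.0, 0.0, 0.0             # assumption: empty window gets zero weight
    e1 = eigvecs[:, 2]                        # eigenvector of the largest eigenvalue
    f = (1.0 - (lam2 / lam1) ** m) ** n
    d_p, d_sh, d_sv = (abs(e1 @ u) ** l for u in (u_p, u_sh, u_sv))
    return f, d_p, d_sh, d_sv

def flinn_filter(data, dt, window_s, u_p, u_sh, u_sv, m=1, n=2, l=2):
    """Return an array with columns P_t, SH_t, SV_t for (T, 3) input data."""
    data = np.asarray(data, dtype=float)
    u_p, u_sh, u_sv = (np.asarray(u, dtype=float) for u in (u_p, u_sh, u_sv))
    half = max(1, int(round(window_s / dt)) // 2)
    out = np.zeros_like(data)
    for t in range(half, len(data) - half):   # t is the window middle point
        f, d_p, d_sh, d_sv = flinn_weights(data[t - half:t + half + 1],
                                           u_p, u_sh, u_sv, m, n, l)
        x = data[t]
        out[t] = f * np.array([d_p * (x @ u_p), d_sh * (x @ u_sh), d_sv * (x @ u_sv)])
    return out

# Hypothetical check: a signal linearly polarized along U_P passes unchanged.
time = np.arange(200) * 0.05
signal = np.sin(2 * np.pi * 1.0 * time)
filtered = flinn_filter(np.outer(signal, [0, 0, 1]), 0.05, 1.0,
                        u_p=[0, 0, 1], u_sh=[1, 0, 0], u_sv=[0, 1, 0])
```

For a purely linear signal the second eigenvalue vanishes, so F = 1 and D_P = 1, and the P output reproduces the input while the SH and SV outputs stay at zero.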

Input parameters of the program 

All input parameters of the program have to be contained in the file “pol.inp”. 
Example of the file is given below: 

***** FILE OF INPUT PARAMETERS FOR PROGRAM "POL": standard***** 
READ/WRITE MODE: -1 - FROM/TO DISC FILE; 1 - FROM/TO SYSTEM STACK 
1 1 

CHANNELS FOR PROCESSING (FOR READ MODE = 1) 

3 

PROCESSING MODE: 0-GIVEN AZIMUTH AND INCIDENCE; 1-MAXIMUM LINEARITY; 
2-MAXIMUM LINEARITY IN CURRENT WINDOW; 3-SCANNING THROUGH AZIMUTH; 
4-SCANNING THROUGH INCIDENCE 
2 

BACK-AZIMUTH FROM STATION TO THE SOURCE, DEGREES 
71.22 

INCIDENCE ANGLE (FROM VERTICAL) OF WAVE ARRIVAL, DEGREES 
20 

LENGTH OF TIME WINDOW FOR FILTER ADAPTATION, SEC 
4. 

DELAY OF TIME WINDOW FROM START POINT OF TRACES, SEC 
176. 

LINEARITY ORDER 1 
1 

LINEARITY ORDER 2 
2 

DIRECTION ORDER 
2 

NUMBER OF RAYS (NR<=100) 

20 

NAME OF DATA PARAMETER FILE (FOR READ MODE = -1) 
data.par 

NAME OF DATA FILE (FOR READ MODE = -1) 
data.dat 

NAME OF OUTPUT FILE (FOR WRITE MODE = -1) 
out.dat 

Explanation of the input parameters 

1 . READ/WRITE MODE: FROM/TO DISC FILE = -1; FROM/TO SYSTEM 
STACK =1; 

The parameter determines the location of input data and the device on which the 
processing results are stored: -1 is a disk file, 1 is the stack of the SNDA System; 

2. CHANNELS FOR PROCESSING (FOR READ MODE = 1) 

If data are read from the SNDA stack this parameter assigns the numbers of channels 
chosen for the processing 

3. PROCESSING MODE: 0 - GIVEN AZIMUTH AND INCIDENCE; 1 - 

MAXIMUM LINEARITY; 2-MAXIMUM LINEARITY WITH GIVEN TIME; 3 - 
SCANNING THROUGH AZIMUTHS; 4 - SCANNING THROUGH 

INCIDENCE ANGLES 

The parameter determines the filtering mode: 

0 means that the normal vector to the P-wave front is known; 

1 means that the normal vector to the P-wave front is determined in the program 
from the maximum of polarization linearity over the total data interval being 
processed; 

2 means that the normal vector to the P-wave front is determined in the program 
from the polarization inside the time window at the assigned position; 

3 means that scanning through arrival azimuths is performed (for the assigned 
incidence angle) to estimate the arrival direction; 

4 means that scanning through arrival incidence angles is performed (for the 
assigned azimuth) to estimate the arrival direction. 

4. BACK-AZIMUTH FROM STATION TO THE SOURCE (DEGREES) 

The parameter assigns the arrival back-azimuth used in filtering modes 0 and 4 
(in mode 3 the azimuth is scanned). 

5. INCIDENCE ANGLE OF WAVE ARRIVAL (FROM VERTICAL, DEGREES) 

The parameter assigns the arrival incidence angle used in filtering modes 0 and 3 
(in mode 4 the incidence angle is scanned). 

6. LENGTH OF TIME WINDOW FOR FILTER TEACHING, SEC 

The length of the moving time window used for calculating a filter output value (sec). 

7. DELAY OF TIME WINDOW FROM START POINT OF TRACES, SEC 

The time from the start of the data to the start of the time window used for 
determining the P-wave arrival direction. 

8. LINEARITY ORDER 1 

The power m of exponent for calculation of weight function F 

9. LINEARITY ORDER 2 

The power n of exponent for calculation of weight function F 

10. DIRECTION ORDER 

The power l of exponent for calculation of weight function D 

11. NUMBER OF RAYS FOR SCANNING (<=100) 

The number of rays used for scanning arrival directions in 3 and 4 filtering modes 

12. NAME OF DATA PARAMETER FILE (FOR READ MODE =-1) 

The name of the disk file containing the parameters of the data to be processed: the 
number of channels, the number of data points per channel and the data sampling interval. 

13. NAME OF DATA FILE (FOR READ MODE =-1) 

The name of disk file containing data to be processed. Data have to be written in 
ASCII format in multiplex form without any header. 

14. NAME OF OUTPUT FILE (FOR WRITE MODE = -1) 

The name of output disk file for saving of filtering results. The output data are written 
in ASCII format in the multiplex form without any header 

Input data format 

The program performs polarization filtering of data recorded by a single 
3-component station. The 3 data channels to be processed have to correspond to the SN, 
EW and Z sensor components. Mixing up the channel order does not cause the 
program to fail but leads to wrong results. More than 3 channels are 
not permitted. The number of data samples has to be the same in every channel and 
less than 10000. 
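
As an illustration of the "multiplex" data layout, here is a minimal Python sketch. It assumes that "multiplex form" means sample-interleaved values (channel 1, 2, 3 for the first sample, then channel 1, 2, 3 for the second, and so on); the function name `demultiplex` is hypothetical and is not part of the SNDA software.

```python
import numpy as np

def demultiplex(values, n_channels):
    """Split a flat, sample-interleaved sequence into an
    (n_channels, n_samples) array (assumed multiplex layout)."""
    flat = np.asarray(values, dtype=float).ravel()
    if flat.size % n_channels != 0:
        raise ValueError("number of values is not a multiple of the channel count")
    # Each row of the reshaped array is one time sample; transpose to channels.
    return flat.reshape(-1, n_channels).T

# Four samples of a 3-channel recording, written without any header.
text = "1 10 100\n2 20 200\n3 30 300\n4 40 400"
traces = demultiplex([float(v) for v in text.split()], 3)
```

Here `traces[0]`, `traces[1]` and `traces[2]` would hold the SN, EW and Z traces if the file follows the channel order required above.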

Output data format 

The number and types of the program output traces depend on the PROCESSING 
MODE parameter value. When this parameter is equal to 0, 1 or 2 there are 8 traces: 

1) Resulting trace for filtering in the current polarization direction (main eigen vector 
direction) of 3C data being processed; 

2) Resulting trace for filtering in P-wave oscillation direction; 

3) Resulting trace for filtering in SH-wave oscillation direction; 

4) Resulting trace for filtering in SV-wave oscillation direction; 

5) Polarization linearity weights $F_t$; 

6) Directional weights $D_t$; 

7) Azimuths of the main axis (eigen vector) of the current data polarization; 

8) Angles of incidence of the main axis (eigen vector) of the current data polarization. 

When the PROCESSING MODE parameter is equal to 3 there are N output 
filtered traces (where N is equal to the value of the parameter NUMBER OF RAYS FOR 
SCANNING) corresponding to different azimuths (in the range of 0 to 360 degrees) and to the 
incidence angle assigned by the parameter INCIDENCE ANGLE OF WAVE ARRIVAL. 
When the PROCESSING MODE parameter is equal to 4 there are N output filtered traces 
corresponding to different incidence angles in the range of 0 to 90 degrees and to the azimuth 
assigned by the parameter BACK-AZIMUTH FROM STATION TO THE SOURCE. 

6.3. Program “POLCFLTS” 

Vector polarization filtering of multichannel data 

The program implements the method of multichannel data polarization filtering 
put forward by J. Samson and J. Olson. Most polarization filtering algorithms known 
from the literature have a vector frequency response: they take a multidimensional 
time series as input and produce a single output trace, which is the result 
of polarization filtering in the direction of the main axis of the data covariance matrix. 
Samson and Olson proposed an alternative approach in which the polarization filter 
has a scalar frequency response and produces the same number of output traces as it has 
at the input: 

$$Y(t) = \sum_{i=0}^{N-1} X(t-i)\, h(i)$$ 

where $X(t) = [X_1(t), X_2(t), \ldots, X_M(t)]^T$ is the vector input time series of the filter; 
$Y(t) = [Y_1(t), Y_2(t), \ldots, Y_M(t)]^T$ is the filter output vector time series, having the same 
dimension as the input one; $h(t)$ is the scalar response of the filter (the time-domain 
counterpart of its frequency response $h(f)$); N is the length 
of the time window used for calculating the filter output at the current time t; 
M is the number of data channels. 

The main information needed for constructing the filter frequency response 
$h(f)$ is the frequency-dependent polarization characteristic of the input signal X(t). 

Let us consider the smoothed estimate of the matrix power spectral density (MPSD) of 
the input signal X(t): 

$$C(f, \delta) = \sum_{k=K_1}^{K_2} a(k)\, Z(k)\, Z^*(k)$$ 

where $Z(k) = \sum_{t=0}^{N-1} X(t) \exp(-i 2\pi k t / N)$ is the Fourier transform of the input signal, 
$k = 0, 1, \ldots, N-1$; $(K_1, K_2)$ is the interval for smoothing through frequencies; 
$f = (K_1 + K_2)/(2N\Delta t)$; $\delta = (K_2 - K_1)/(N\Delta t)$; $\Delta t$ is the data sampling interval; $*$ is the 
symbol of Hermitian conjugation; $a(k)$ is a smoothing window satisfying the 
condition $\sum_{k=K_1}^{K_2} a(k) = 1$. 

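The number of degrees of freedom $\nu$ of the smoothed estimate C(f) of the data MPSD is 
given by an expression involving a sum over $k = K_1, \ldots, K_2$ (the right-hand side of the 
formula for $\nu$ is not legible in the source). 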


The measure P(f) of polarization of the input multichannel data X(t) is 
determined by the spectrum of eigenvalues of the matrix C(f) (for purely polarized X(t) 
there exists only one nonzero eigenvalue). A computationally efficient measure P(f) 
can be derived from the characteristic polynomial of the matrix C(f) and has the 
form: 

$$P = \frac{M \cdot \mathrm{tr}(C^2) - (\mathrm{tr}(C))^2}{(M-1) \cdot (\mathrm{tr}(C))^2}$$ 

where tr(C) is the trace of the matrix C (the dependence of P and C on f is omitted for 
simplicity). 
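
The behaviour of the polarization measure at its two extremes can be checked numerically. The sketch below assumes the usual trace-based form $(M\,\mathrm{tr}(C^2) - (\mathrm{tr}\,C)^2)/((M-1)(\mathrm{tr}\,C)^2)$, which is consistent with the denominator $(M-1)(\mathrm{tr}(C))^2$ given above; the function name is hypothetical.

```python
import numpy as np

def polarization_measure(C):
    """Trace-based polarization measure of an M x M Hermitian matrix C
    (assumed form; equals 1 for rank-one C and 0 for C proportional to I)."""
    C = np.asarray(C)
    M = C.shape[0]
    tr = np.trace(C).real
    tr2 = np.trace(C @ C).real
    return (M * tr2 - tr**2) / ((M - 1) * tr**2)

# Purely polarized case: C has a single nonzero eigenvalue.
z = np.array([1.0, 2.0j, -0.5])
P_pure = polarization_measure(np.outer(z, z.conj()))

# Completely unpolarized case: all eigenvalues are equal.
P_iso = polarization_measure(np.eye(3))
```

Under this assumption `P_pure` evaluates to 1 and `P_iso` to 0, up to rounding.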

As the frequency response of the polarization filter one may choose a suitable 
function of the polarization measure P(f). A convenient choice is 
$h(f) = P^g(f)$, where g is a positive quantity determining the degree of nonlinearity of 
the filter. Increasing g leads to stronger suppression of the input data in 
intervals where the data contain strong unpolarized components (e.g. have a small signal-to- 
noise ratio). 

The algorithm of vector polarization filtering of multichannel data can now be 
written as follows: 

$$Y(t) = N^{-1} \sum_{k=0}^{N-1} Z(k)\, P^g \exp(i 2\pi k t / N), \quad t = 0, 1, \ldots, N-1.$$ 

The mean value and variance of the estimated polarization characteristic P(f) 
depend strongly on $\nu$, the number of degrees of freedom of the MPSD estimate C(f). As 
$\nu$ increases, the mean value of P(f) tends to its true value and the variance 
decreases. However, using large values of $\nu$ is not advisable because it implies 
poor frequency resolution of the polarization filtering. The number of channels M 
strongly affects the variance of P(f), and for large M it is recommended to 
increase the filter nonlinearity order g. 

The program described here implements the above algorithm for the case of a 
moving time window. This variant of the filter is the most suitable for processing 
nonstationary time series such as multichannel seismic recordings, where the 
data polarization characteristics can change significantly along the time interval being 
processed. Note that the given program version uses rectangular time windows 
(with the number of degrees of freedom equal to N) which do not overlap, for the sake of 
saving computational resources. 
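
The processing of one time window can be sketched in Python as follows. This is only an illustration under stated assumptions, not the program's code: the smoothing window is taken as a normalized triangular window applied circularly over frequency bins, the polarization measure uses the trace-based form, and the names `polarization_filter_window`, `g` and `half_width` are hypothetical.

```python
import numpy as np

def polarization_filter_window(X, g=3, half_width=2):
    """Filter one window X of shape (M, N) with the scalar response P**g.
    Returns the filtered traces (M, N) and P for every frequency bin."""
    M, N = X.shape
    Z = np.fft.fft(X, axis=1)                        # Z[:, k], k = 0..N-1
    outer = np.einsum('ik,jk->kij', Z, Z.conj())     # Z(k) Z*(k), shape (N, M, M)
    # Triangular smoothing weights a(k), normalized so that they sum to 1.
    a = np.bartlett(2 * half_width + 3)[1:-1]
    a /= a.sum()
    C = np.zeros_like(outer)
    for j, weight in enumerate(a):
        # Circular shift along the frequency axis (an assumption of this sketch).
        C += weight * np.roll(outer, j - half_width, axis=0)
    tr = np.einsum('kii->k', C).real
    tr2 = np.einsum('kij,kji->k', C, C).real
    P = (M * tr2 - tr**2) / ((M - 1) * tr**2 + 1e-30)
    P = np.clip(P, 0.0, 1.0)                         # guard against rounding
    Y = np.fft.ifft(Z * P**g, axis=1).real           # same scalar response for all channels
    return Y, P
```

A purely polarized input (all channels proportional to one waveform) should pass through essentially unchanged, while uncorrelated noise is attenuated; moving the window along the record without overlap would mimic the scheme described above.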

Input parameters of the program 

All input parameters of the program have to be contained in the file 
“polcflt.inp”. An example of the file is given below: 

***FILE OF INPUT PARAMETERS FOR PROGRAM "POLCFLTS": standard *** 
READ/WRITE MODE: FROM/TO DISK FILE = -1; FROM/TO SYSTEM STACK = 1 
1 1 

ORDER OF POLARIZATION FILTER 
3 

LENGTH OF MOVING WINDOW FOR PROCESSING (IN SEC) 

50 

NUMBER OF FREEDOM DEGREES 
6 

DATA CHANNELS FOR PROCESSING (FOR READ MODE = 1) 
all 

DATA PARAMETER FILE NAME (FOR READ MODE = -1) 
data.par 

DATA FILE NAME (FOR READ MODE = -1) 
data.dat 

NUMBER OF FIRST POINT FOR PROCESSING (FOR READ MODE = -1) 

1 

NUMBER OF DATA POINTS FOR PROCESSING (FOR READ MODE = -1) 

1000 

NAME OF OUTPUT FILE (FOR WRITE MODE = -1) 
out.dat 

Explanation of the input parameters 

1. READ/WRITE MODE: FROM/TO DISK FILE = -1; FROM/TO SYSTEM STACK = 1 

The parameter determines the location of the input data and the device on which the 
processing results are stored: -1 is a disk file, 1 is the stack of the SNDA System. 

2. ORDER OF POLARIZATION FILTER 

The parameter determines the nonlinearity of the filter: the power g in the filter 
response $P^g$. 

3. LENGTH OF MOVING WINDOW FOR FILTERING (IN SEC) 

The length of moving time window for data processing (sec); 

4. NUMBER OF FREEDOM DEGREES 

The number of degrees of freedom for the estimate of the matrix power spectral density C(f). 
The parameter determines the length of the triangular frequency smoothing window. 

5. DATA CHANNELS FOR PROCESSING (FOR READ MODE = 1) 

If data are read from the SNDA stack this parameter assigns the numbers of channels 
chosen for the processing; 

6. NAME OF DATA FILE (FOR READ MODE = -1) 

The name of the disk file containing the data to be processed. Data have to be written in 
ASCII format in multiplex form without any header. 

7. NAME OF FILE WITH DATA PARAMETERS (FOR READ MODE = -1) 

The name of the disk file containing the parameters of the data to be processed: the number of 
channels, the number of data points per channel and the data sampling interval. 

8. NUMBER OF FIRST POINT FOR PROCESSING (FOR READ MODE = -1) 

If data are read from a disk file, this parameter defines the number of the first sample of 
the data interval chosen for processing. The preceding samples are omitted. 

9. NUMBER OF DATA POINTS FOR PROCESSING (FOR READ MODE = -1) 

If data are read from a disk file, this parameter defines the length of the data interval for 
processing (in samples). This interval is then divided into a number of 
non-overlapping time windows whose length is assigned by the parameter LENGTH OF 
MOVING WINDOW FOR FILTERING. 

10. NAME OF OUTPUT FILE (FOR WRITE MODE = -1) 

The name of the output disk file for saving the filtering results. The output data are written 
in ASCII format in multiplex form without any header. 

Input data format 

The input data of the program may consist of no more than 50 time series 
with the same number of samples in every series, not exceeding 4096. The 
ordering of the input channels does not matter for the program. 

Output data format 

As a result of running the program one gets M+1 traces. The first M traces are 
the filtered time series (there are the same number of traces and the same channel 
ordering as for the input data). The last output trace presents the filter response 
$P^g$ averaged over frequencies for each moving time window inside the 
total interval of processed data. If the program output is put into the SNDA stack, all 
traces are placed at the end of the stack. 

6.4. Program “ARMAFS” 

Estimation of the inverse matrix power spectral density of multichannel data 
by ARMA modeling 

The program is designed for estimating the inverse matrix spectral density of 
multichannel data. The estimation is performed by multidimensional autoregressive 
moving-average (ARMA) modeling of the multichannel time series 
$x(t) = (x_1(t), x_2(t), \ldots, x_M(t))^T$, which means approximating x(t) by the random 
process y(t) satisfying the following equation: 

$$y(t) = \sum_{l=1}^{P} A(l)\, y(t-l) + \sum_{l=0}^{Q} B(l)\, e(t-l), \quad t = 1, \ldots, N$$ 

where N is the length of the time series being modeled; P is the order of the autoregressive 
(AR) part of the model; Q is the order of the moving-average (MA) part of the model; 
A(l), l = 1,...,P, are the MxM matrices of AR parameters; B(l), l = 0,...,Q, are the MxM 
matrices of MA parameters; M is the dimension of the time series; e(t) is an M- 
dimensional Gaussian white time series with zero mean and unit covariance 
matrix. The procedure for calculating the parameters A(l) and B(l) is chosen to provide 
a good fit of the model y(t) to the observations x(t) while 
saving computational time. 

The estimate $F^{-1}(f)$ of the inverse matrix power spectral density of the 
multichannel observations is evaluated from the A(l) and B(l) estimates using the 
equation: 

$$F^{-1}(f) = \left(\sum_{k=1}^{P} A(k) \exp(i 2\pi k f)\right) \left(\sum_{k=0}^{Q} B(k) \exp(i 2\pi k f)\right)^{-1} \left(\sum_{k=1}^{P} A^T(k) \exp(-i 2\pi k f)\right)$$ 

Input parameters of the program 

All input parameters of the program have to be contained in the file 
“armafs.inp”. An example of the file is given below: 

*** FILE OF INPUT PARAMETERS FOR PROGRAM ARMAFS: standard*** 

ADAPTATION MODE: ARWF = -1; ARLD = -2; MA = -3; ARMA = -4 
-4 

ORDER OF MULTIDIMENSIONAL AUTOREGRESSIVE MODEL 
10 

ORDER OF MULTIDIMENSIONAL MOVING-AVERAGE MODEL 
10 

VALUE OF A REGULARIZATOR FOR MATRIX AUTOCOVARIANCE FUNCTION 
0.0001 

LOW & HIGH FREQUENCIES OF FREQUENCY RANGE 
0. 5. 

FILTERING MODE: WITH INTERPOLATION = 1, WITHOUT = -1 
1 

NUMBER OF FREQUENCIES (IF FILTERING MODE IS = 1) 

128 

NAME OF OUTPUT FILE FOR INVERSE MATRICES 
insp.mtrs 

NAME OF OUTPUT FILE FOR INVERSE MATRIX PARAMETERS 
insp.par 

READ MODE: FROM DISK FILE = -1; FROM SYSTEM STACK = 1 
1 

DATA CHANNELS TO BE PROCESSED (IF READ FROM DISK FILE THEN AT THE END OF 
LIST MUST BE 0) 

25 

NAME OF DATA FILE (FOR READ MODE = -1) 
fchn.dat 

NAME OF DATA PARAMETER FILE (FOR READ MODE = -1) 
fchn.par 

Explanation of input parameters 

1. ADAPTATION MODE: ARWF = -1; ARLD = -2; MA = -3; ARMA = -4 

The parameter determines the type of the multichannel data model: 

(-1) means data modeling by an autoregressive (AR) process of order P 
(Q = 0); the AR matrix coefficients A(l), l = 1,...,P, are calculated with the 
computationally efficient Levinson-Durbin procedure, and the covariance matrix of 
residuals B(0) is estimated as the covariance matrix of the output signal of the 
multichannel whitening filter: 

$$B(0) = \frac{1}{N} \sum_{t=1}^{N} y(t)\, y^*(t), \quad \text{where } y(t) = \sum_{k=1}^{P} A(k)\, x(t-k).$$ 

(-2) means that the data are also modeled by an AR process of order P (Q = 0), but in 
contrast to the (-1) mode all AR model matrix parameters, A(l) and B(0), are 
estimated by the Levinson-Durbin procedure. This is less time consuming than 
the (-1) mode, but for strongly coherent data it can lead to computational 
instability. 

(-3) means data modeling by a moving average (MA) process of order Q 
(P = 0); the MA matrix coefficients B(l), l = 0,...,Q, are calculated by the equation: 

$$B(l) = \frac{1}{N} \sum_{t=1}^{N} x(t)\, x^*(t-l).$$ 

(-4) means data modeling by an autoregressive moving-average (ARMA) process 
of orders P and Q. The AR matrix coefficients A(l), l = 1,...,P, are calculated with 
the multichannel Levinson-Durbin procedure; the MA matrix coefficients B(k), 
k = 0,...,Q, are estimated as the matrix covariances at lags 0,...,Q of the output process of 
the multichannel whitening filter: 

$$B(k) = \frac{1}{N} \sum_{t=1}^{N} y(t)\, y^*(t-k), \quad k = 0, \ldots, Q, \quad \text{where } y(t) = \sum_{l=1}^{P} A(l)\, x(t-l).$$ 

2. ORDER OF MULTIDIMENSIONAL AUTOREGRESSIVE MODEL PART 

The order of the AR part of the model must not exceed 50. 

3. ORDER OF MULTIDIMENSIONAL MOVING-AVERAGE MODEL PART 

The order of the MA part of the model must not exceed 50. 

4. VALUE OF A REGULARIZATOR FOR MATRIX AUTOCOVARIANCE 
FUNCTION 

If the multidimensional time series being modeled has high correlations between its 
component processes, it is advisable to regularize the diagonal elements $B_{ii}(0)$ of the 
estimated data covariance matrix by the formula 
$B'_{ii}(0) = B_{ii}(0)(1 + REG)$, where REG is the regularization parameter with 
recommended values 0.01 - 0.0001. 
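
The effect of this diagonal regularization can be illustrated with a short Python sketch; the function name `regularize_diagonal` is hypothetical. For perfectly coherent channels the covariance matrix is singular, and scaling the diagonal by (1 + REG) makes it invertible.

```python
import numpy as np

def regularize_diagonal(B0, reg=1e-4):
    """Return a copy of the zero-lag covariance matrix B0 with every
    diagonal element multiplied by (1 + reg)."""
    B = np.array(B0, dtype=float, copy=True)
    idx = np.diag_indices_from(B)
    B[idx] *= (1.0 + reg)
    return B

# Two fully coherent channels give a singular covariance matrix.
B0 = np.array([[2.0, 2.0], [2.0, 2.0]])
B_reg = regularize_diagonal(B0, reg=0.01)
```

Here the determinant of `B0` is zero, while `B_reg` has a small positive determinant, so the subsequent matrix inversions are better conditioned.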

5. LOW & HIGH FREQUENCIES OF FREQUENCY RANGE 

These two parameters assign the frequency interval $(f_l, f_h)$ within which the data 
inverse matrix power spectral density (MPSD) is calculated. 

6. FILTERING MODE: WITH INTERPOLATION = 1, WITHOUT = -1 

If this parameter is equal to 1 (with interpolation), the data inverse MPSD is calculated 
on an equidistant grid in the interval $(f_l, f_h)$; the number of grid points is determined 
by the parameter NUMBER OF FREQUENCIES. 
If this parameter is equal to -1 (without interpolation), the number of 
grid points is determined as $N_f = 2^n < N/2$, the greatest power of two 
which is less than half of the number of data samples in every channel. In both cases it is required that 
$N_f < 257$. 
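
The grid size for the mode without interpolation can be illustrated as follows. This is a sketch of the rule as stated (the greatest power of two strictly less than half the number of samples); the function name is hypothetical, and the additional requirement that the result be less than 257 is left to the caller.

```python
def grid_size_without_interpolation(n_samples):
    """Greatest power of two that is strictly less than n_samples / 2."""
    if n_samples <= 2:
        raise ValueError("need more than 2 samples per channel")
    nf = 1
    while nf * 2 < n_samples / 2:
        nf *= 2
    return nf

nf_1000 = grid_size_without_interpolation(1000)   # half is 500
nf_512 = grid_size_without_interpolation(512)     # half is 256, so the result is below 256
```

For a channel of 1000 samples this gives 256 grid points, which satisfies the limit of fewer than 257 points.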

7. NUMBER OF FREQUENCIES (IF FILTERING MODE IS 1) 

This parameter determines the number of frequencies for the inverse MPSD calculation 
in the case where the parameter FILTERING MODE = 1. The parameter value has to be 
less than 257. 

8. NAME OF OUTPUT FILE FOR INVERSE MATRIX 

This is the name of the disk file in which the values of the calculated inverse MPSD are to be 
stored. 

9. NAME OF OUTPUT FILE FOR INVERSE MATRIX PARAMETERS 

This is the name of disk file in which the parameters of calculated inverse MPSD are 
to be stored 

10. READ MODE: FROM DISK FILE = -1; FROM SYSTEM STACK = 1 

This switch selects the device from which the input data for processing are read: (-1) 
means that data are read from a disk file, (1) means that data are read from the system stack. 

11. DATA CHANNELS TO BE PROCESSED (IF READ FROM DISK FILE - 0 - 
END OF LIST) 

The parameter determines the channel numbers of the multidimensional time series to be 
processed. If the data are read from the SNDA stack, the parameter value must be a 
string having one of the following forms: 15; (1,3-10,14); all (in accordance with the 
CHANNELS format in the SNDA System stack command parameters). If the data 
are read from a disk file, the parameter value is a column of channel numbers terminated 
by 0. In both cases the total number of channels chosen has to be less than 50. 

12. NAME OF DATA FILE (FOR READ MODE = -1) 

If the parameter READ MODE = -1, the name of the file containing the samples of the 
data to be processed is assigned here. The data must be in ASCII multiplex 
format without any header. 

13. NAME OF DATA PARAMETER FILE (FOR READ MODE = -1) 

If the parameter READ MODE = -1, the name of the file containing the parameters of the 
input data is assigned here. The parameters are: the number of data channels, the number of 
samples in every channel and the data sampling interval; the file must have the form 
of an ASCII string. 

Input data format 

The multichannel time series that can be processed by the program have to 
consist of no more than 50 channels with no more than 4096 samples in every 
channel. 

Output data format 

As a result of running the program two files are created: the first contains the 
inverse MPSD matrices of the input data calculated for the given set of frequencies, 
the second contains the parameters of the data and of the calculated MPSD estimate: the 
number of channels processed, the type and orders of the ARMA model, and so on. 
Both files are written in ASCII format. 

6.5. Program “GRFILTFS” 

Optimal Wiener group filtering of 3-component array data for different types of wave 
polarization 

The program is designed for optimal Wiener group filtering of 
multichannel seismic recordings with the purpose of extracting “useful” seismic wave 
phase waveforms from coherent seismic noise. The wave phase is assumed 
to arrive from a definite direction and to be polarized in accordance with one of the 
following polarization types: P, SH (or L), SV and R. 

Let the coordinate system have the X-axis directed to the East, the Y-axis to the 
North and the Z-axis to the Zenith. Assume that a plane seismic wave phase arrives at the 
recording site from the direction determined by the azimuth $\alpha$ and incidence angle $\beta$, 
and has the phase velocity V. Denote by s(t) the waveform of the seismic phase at 
the coordinate origin $r_0 = (0,0,0)$. Then the vector signal at the output of the 
3-component array can be written in the frequency domain as 

$$u(f) = h(f)\, s(f); \quad h(f) = (h_1(f), \ldots, h_{3M}(f))^T = \varphi(f) \otimes b,$$ 

where $\varphi(f) = \{\exp(-i 2\pi f (r_i^T a)/V),\ i = 1, \ldots, M\}$ is the M-dimensional column vector of the 
signal phase delays at the array seismometers ($r_i$ being the position of the i-th seismometer); 
$\otimes$ is the Kronecker product of $\varphi$ and the 
3-dimensional vector b; s(f) is the Fourier transform of the signal s(t); h(f) is the 
vector frequency response of the seismic wave propagation paths from the coordinate 
origin $r_0$ to the array sensors; b is the vector reflecting the polarization features of the 
seismic wave phase; $a = (\sin\alpha \sin\beta,\ \cos\alpha \sin\beta,\ \cos\beta)^T$ is the unit vector orthogonal to the 
phase wave front. 

The vector b depends on the wave phase type (P, SH, SV, Love or Rayleigh). 
For the simplest laterally homogeneous medium model, neglecting the effect of the 
free surface on the wave propagation, it has the form: 

For P-waves: $b_P = (-\sin\alpha \sin\beta,\ -\cos\alpha \sin\beta,\ \cos\beta)^T$ 

For SH and Love waves: $b_L = (\cos\alpha,\ -\sin\alpha,\ 0)^T$ 

For SV-waves: $b_V = (\sin\alpha \cos\beta,\ \cos\alpha \cos\beta,\ \sin\beta)^T$ 

For Rayleigh waves: $b_R = (-i \sin\alpha \sin\gamma,\ -i \cos\alpha \sin\gamma,\ \cos\gamma)^T$ 

In practical applications the seismic signal is, as a rule, recorded against a 
background of additive seismic noise. So the multichannel recordings of the 
3-component array can be expressed by the following equation: $u(f) = h(f)\, s(f) + \xi(f)$, 
where $\xi(f)$ is the Fourier transform of the array seismic noise recordings. 

The optimal undistorting group filter (OUGF) is known as the filter providing 
maximum suppression of seismic noise containing coherent components while 
extracting, without distortion, the waveform of the seismic phase arriving from the 
known direction. Its frequency response is equal to 

$$\Phi_1(f) = F^{-1}(f)\, h(f)\, \left(h^*(f)\, F^{-1}(f)\, h(f)\right)^{-1},$$ 

where $F(f) = E\{\xi(f)\, \xi^*(f)\}$ is the matrix power spectral density (MPSD) of the array noise 
recordings. The restriction that the OUGF frequency response must not distort the 
waveform of the phase arriving from the given direction with the given polarization leads to 
the following equation: $\Phi_1^*(f)\, h(f) = 1$. 
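
At a single frequency the undistorting filter is a short matrix computation. The following Python sketch (hypothetical function name `ougf`) computes the filter weights from a given inverse noise MPSD and vector h, and can be used to confirm numerically that the no-distortion condition holds.

```python
import numpy as np

def ougf(F_inv, h):
    """Undistorting group filter at one frequency:
    F^{-1} h (h^* F^{-1} h)^{-1}."""
    w = F_inv @ h
    return w / (h.conj() @ w)

# Illustrative data: a random Hermitian positive definite noise MPSD
# and a unit-modulus vector h for a 4-channel array.
rng = np.random.default_rng(1)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
F = G @ G.conj().T + np.eye(4)
h = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 4))
phi1 = ougf(np.linalg.inv(F), h)
constraint = np.vdot(phi1, h)      # Hermitian product phi1^* h
```

The value of `constraint` equals 1 up to rounding, whatever the noise matrix is; only the noise suppression depends on F.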

In the more general case it is possible to impose arbitrary restrictions on the 
group filter frequency response: $\Phi^* H = c$, where the matrix H specifies the restriction types 
and the vector c defines the desired filter response under these restrictions. This version 
of the program implements the variant proposed by J. Classen, in which the following three 
restrictions are imposed: one requiring the response amplitude to be unity for the assigned 
arrival direction, and two requiring the x- and y-spatial derivatives of the response 
function to be equal to zero for the same direction: 

$$H = [h,\ \partial h/\partial p_x,\ \partial h/\partial p_y]$$ 

$$c = (1, 0, 0)^T$$ 

where $p_x = \sin\alpha \sin\beta / V$ is the apparent slowness of the wave phase in the X-axis direction; 
$p_y = \cos\alpha \sin\beta / V$ is the same for the Y-axis. 

The group filter frequency response in this case has the form 

$$\Phi_2(f) = F^{-1}(f)\, H(f)\, \left(H^*(f)\, F^{-1}(f)\, H(f)\right)^{-1} c.$$ 

We call such a filter the optimal undistorting constrained group filter (OUCGF). 
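
The constrained filter can be sketched in the same way. The example below is an illustration for a 1-component array only, where each element of h is a pure phase delay $\exp(-i2\pi f(x_i p_x + y_i p_y))$ and its slowness derivatives are available in closed form; for 3-component data the derivative of the polarization vector would also have to be included, which this sketch does not attempt. All names and numerical values are hypothetical.

```python
import numpy as np

def oucgf(F_inv, H, c):
    """Constrained group filter at one frequency:
    F^{-1} H (H^* F^{-1} H)^{-1} c."""
    G = F_inv @ H
    return G @ np.linalg.solve(H.conj().T @ G, c)

# 1-component array: sensor coordinates in km, frequency in Hz, slowness in s/km.
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]])
f, px, py = 2.0, 0.05, 0.10
h = np.exp(-2j * np.pi * f * (xy[:, 0] * px + xy[:, 1] * py))
dh_dpx = -2j * np.pi * f * xy[:, 0] * h
dh_dpy = -2j * np.pi * f * xy[:, 1] * h
H = np.column_stack([h, dh_dpx, dh_dpy])
c = np.array([1.0, 0.0, 0.0], dtype=complex)

F_inv = np.eye(len(xy))              # uncorrelated white noise, for illustration
phi2 = oucgf(F_inv, H, c)
residual = H.conj().T @ phi2 - c     # should vanish: all three restrictions hold
```

The two extra restrictions flatten the filter response around the assigned slowness, at the cost of some noise suppression compared with the single-restriction filter.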


One can easily verify that the noise power spectral density at the output of the 
OUGF $\Phi_1(f)$ is equal to $\sigma^2(f) = \left(h^*(f)\, F^{-1}(f)\, h(f)\right)^{-1}$. Thus the group filter with the 
frequency response $\Phi_3(f) = F^{-1}(f)\, h(f)\, \left(h^*(f)\, F^{-1}(f)\, h(f)\right)^{-1/2}$ produces at its output 
noise with unit power spectral density. We call such a filter the optimal whitening 
group filter (OWGF). 
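
Both statements about the output noise level can be checked numerically at one frequency. In the sketch below the function names are hypothetical and the noise matrix is random illustrative data.

```python
import numpy as np

def ougf(F_inv, h):
    """Undistorting filter: F^{-1} h (h^* F^{-1} h)^{-1}."""
    w = F_inv @ h
    return w / (h.conj() @ w)

def owgf(F_inv, h):
    """Whitening filter: F^{-1} h (h^* F^{-1} h)^{-1/2}."""
    w = F_inv @ h
    return w / np.sqrt((h.conj() @ w).real)

rng = np.random.default_rng(2)
G = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
F = G @ G.conj().T + np.eye(5)                  # Hermitian positive definite noise MPSD
F_inv = np.linalg.inv(F)
h = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 5))

phi1, phi3 = ougf(F_inv, h), owgf(F_inv, h)
noise_ougf = np.vdot(phi1, F @ phi1).real       # output noise power of the OUGF
noise_owgf = np.vdot(phi3, F @ phi3).real       # output noise power of the OWGF
sigma2 = 1.0 / np.vdot(h, F_inv @ h).real       # (h^* F^{-1} h)^{-1}
```

`noise_ougf` coincides with `sigma2`, and `noise_owgf` equals 1, in agreement with the text above.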

If the seismic signal is recorded against a background of transient interference waves 
generated by a spatially localized source with known parameters, then the inverse 
noise MPSD can be found analytically: $F^{-1}(f) = [I - q (q^* q)^{-1} q^*]$, where I is the identity 
matrix and q(f) is the vector frequency response of the propagation paths of the interfering 
wave from the noise source to the array sensors. For a plane 
interfering wave the vector q(f) has the same structure as the vector h(f). The 
undistorting optimal group filter designed for this form of the MPSD is called the 
spatially rejecting filter. 
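
The rejecting property can be illustrated with a small Python sketch: the matrix $I - q(q^*q)^{-1}q^*$ is a projector onto the subspace orthogonal to q, so the undistorting filter built from it passes h with unit gain and has zero response to q. Function names and the random vectors are hypothetical.

```python
import numpy as np

def rejection_inverse_mpsd(q):
    """Analytical inverse noise MPSD I - q (q^* q)^{-1} q^* for one interfering wave."""
    q = np.asarray(q, dtype=complex).reshape(-1, 1)
    return np.eye(q.shape[0]) - (q @ q.conj().T) / np.vdot(q, q).real

def rejecting_filter(h, q):
    """Undistorting group filter designed for the rejection form of the MPSD."""
    w = rejection_inverse_mpsd(q) @ h
    return w / (h.conj() @ w)

rng = np.random.default_rng(3)
h = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 6))   # useful phase
q = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 6))   # interfering wave
phi = rejecting_filter(h, q)
gain_signal = np.vdot(phi, h)          # response to the useful phase
gain_interference = np.vdot(phi, q)    # response to the interfering wave
```

The filter is undefined when h is parallel to q, since the useful phase and the interference then cannot be separated by direction.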

Input parameters of the program 

All input parameters of the program have to be contained in the file 
“grfltfs.inp”. An example of the file is given below: 

*** FILE OF INPUT PARAMETERS FOR PROGRAM GRFLTFS: standard *** 

ARRAY TYPE: 3C ARRAY=1; 1C ARRAY=2; 2C HORIZONTAL ARRAY=3; SIMA=4 

2 

FILTER TYPE: UNDISTORTING(1)=1; UNDISTORTING(2)=2; WHITENING=3 

1 

PHASE TYPE: P = 1; SH & LOVE = 2; SV = 3; RAYLEIGH = 4 
1 

INITIAL AND FINAL POINTS FOR SCANNING AT AZIMUTH (DEGREES FROM NORTH) 

0. 350. 

INITIAL AND FINAL POINTS FOR SCANNING AT INCIDENCE ANGLE (DEGREES FROM 
VERT.) 

0. 90. 

INCREMENTS FOR SCANNING (DEGREES) 

10. 10. 

RAYLEIGH WAVE ELLIPTIC COEFFICIENT (HORIZ/VERT) 

0.8 

MEDIUM PHASE VELOCITY (KM/SEC) 

5. 

NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR SURFACE WAVE (IF 
NAME=" " - WITHOUT DISPERSION) 

LOW AND HIGH FREQUENCIES OF FILTERING RANGE (HZ) 

0. 5. 

NUMBER OF FREQUENCY BANDS 

1 

MARGINS OF BANDS (HZ) 

1. 2. 3. 4. 

NOISE MATRIX TYPE: IDENTICAL = -1; FOR REJECTION FILTER=0; ADAPTIVE=1 
1 

NOISE WAVE DIRECTION (AZ. & INC. ANGLE), VELOCITY & WAVE TYPE FOR 
REJECTION FILTERING 
340. 75. 3. 1 

NAME OF FILE WITH ARRAY SEISMOMETER COORDINATES (IN KM) 
21 
arcess.crd 

NAME OF FILE FOR INVERSE MATR. SP. PARAMETERS 
insp.par 

NAME OF FILE FOR INVERSE MATR. SP. VALUES 
insp.mtrs 

READ/WRITE MODE: FROM/TO STACK = 1; FROM/TO DISK FILE = -1 
1 1 

DATA CHANNELS TO BE PROCESSED (IF READ FROM STACK) 

25 

OUTPUT PRESENTATION: AS TIME SERIES = 0; TRACE POWERS = 1 
0 

NAME OF FILE WITH DATA PARAMETERS (IF READ FROM DISK FILE) 
grfltfs.par 

NAME OF FILE WITH DATA SAMPLES (IF READ FROM DISK FILE) 
grfltfs.dat 

NAME OF FILE FOR SAVING OUTPUT (IF WRITE TO DISC FILE) 
grfltfs.out 


Explanation of program parameters 

1. ARRAY TYPE: 3C ARRAY = 1; 1C ARRAY = 2; 2C HORIZONTAL ARRAY = 
3; SIMA = 4 

The parameter defines the type of recording installation and can have the values: 

(1) - for a 3-component array or a single 3-component station (if the parameter NUMBER 
OF SEISMOMETERS is equal to 1); 

(2) - for 1-component array consisting of similarly oriented sensors; 

(3) - for 2-component horizontal array; 

(4) - for strain-seismometer microarray (SIMA), consisting of a single 3C seismometer 
and two horizontal strainmeters located at the same site. 

2. FILTER TYPE: UNDISTORTING(1) = 1; UNDISTORTING(2) = 2; 
WHITENING = 3 

The parameter defines the type of group filter to be used: 

(1) - for the undistorting filter with a single restriction (frequency response (FR) $\Phi_1$); 

(2) - for the undistorting filter with restrictions on the FR spatial derivatives (FR $\Phi_2$); 

(3) - for the whitening group filter (FR $\Phi_3$). 

3. PHASE TYPE: P = 1; SH & LOVE = 2; SV = 3; RAYLEIGH =4 

The parameter defines the type of seismic wave phase to be extracted. This program 
version is intended for extracting only a single phase with a specific polarization: 

(1) - for the P-phase; 

(2) - for the SH or Love phases; 

(3) - for the SV phase; 

(4) - for the Rayleigh phase. 

4. INITIAL AND FINAL POINTS FOR SCANNING AT AZIMUTH (DEGREES 
FROM NORTH) 

The given program version makes it possible to perform the optimal group filtering of 
3-component array data not only for a single arrival direction but for a “fan” of 
directions inside given ranges of azimuth and incidence angle of the wave front (i.e. it 
provides scanning of the medium with the assigned angle increments). The parameter 
assigns the initial and final values of the wave arrival back azimuth (from the recording site 
to the seismic source) to be used while scanning the medium. If the initial value is 
equal to the final one, the group filtering is performed for this single azimuth. 

5. INITIAL AND FINAL POINTS FOR SCANNING AT INCIDENCE ANGLE 
(DEGREES FROM VERT.) 

The parameter assigns the initial and final values of the wave arrival incidence angle 
(measured from the downward normal to the free surface) to be used while scanning the 
medium. If the initial value is equal to the final one, the group filtering is performed 
for this single incidence angle. 

6. INCREMENTS FOR SCANNING (DEGREES) 

These two parameters define the increments for scanning at azimuth and incidence 
angle. 
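
The “fan” of scanning directions defined by parameters 4 to 6 can be sketched as follows. The sketch assumes that both end points are included when they fall on the grid, which the text does not state explicitly; the function name is hypothetical.

```python
import numpy as np

def scan_values(first, last, step):
    """Angles from first to last (inclusive when on the grid) with the given step.
    A single value is returned when first equals last."""
    if first == last:
        return np.array([float(first)])
    if step <= 0:
        raise ValueError("increment must be positive")
    n = int(np.floor((last - first) / step + 1e-9)) + 1
    return first + step * np.arange(n)

# Values taken from the example input file above.
azimuths = scan_values(0.0, 350.0, 10.0)
incidences = scan_values(0.0, 90.0, 10.0)
directions = [(az, inc) for az in azimuths for inc in incidences]
```

With the example values this yields 36 azimuths and 10 incidence angles, that is 360 filtered traces per frequency band if one trace is produced for each direction.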

7. RAYLEIGH WAVE ELLIPTIC COEFFICIENT (HORIZ/VERT) 

This is the ellipticity coefficient of the Rayleigh wave: the ratio of the minor axis of the 
polarization ellipse to the major one. 

8. MEDIUM PHASE VELOCITY (KM/SEC) 

This is the phase velocity of the P or S wave in the medium just beneath the array. If, 
when filtering Rayleigh or Love waves, the value of the parameter “Name of file with 
dispersion curve” consists of 5 blanks, this velocity is used as the average surface wave 
phase velocity. 

9. NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR 
SURFACE WAVE (IF NAME=' ' - WITHOUT DISPERSION) 

This is the name of the file containing the surface wave dispersion curve (the phase 
velocity as a function of frequency). If the name contains blanks in the first 5 positions, 
the dispersion is taken to be absent and the surface wave is regarded as having a 
frequency-independent (mean) phase velocity. The file has to consist of two 
ASCII format columns: the frequency in Hz and the velocity in km/sec. 
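
One possible way to use such a two-column dispersion table is linear interpolation onto the filter frequencies, as in the sketch below. Linear interpolation and holding the end values outside the tabulated range are assumptions of this sketch (they are the behaviour of `numpy.interp`), not documented properties of the program; the function name is hypothetical.

```python
import numpy as np

def phase_velocity(freqs_hz, mean_velocity, curve=None):
    """Phase velocity (km/sec) at the requested frequencies.
    curve: optional (frequencies_hz, velocities) table; None means no dispersion."""
    freqs = np.asarray(freqs_hz, dtype=float)
    if curve is None:
        return np.full_like(freqs, mean_velocity)
    f_tab, v_tab = curve
    return np.interp(freqs, f_tab, v_tab)

v_const = phase_velocity([1.0, 2.0], 3.5)
v_disp = phase_velocity([0.10, 0.15, 0.20], 3.5, curve=([0.10, 0.20], [4.0, 3.0]))
```

The table itself could be read from the two-column ASCII file with `numpy.loadtxt(name, unpack=True)`.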

10. LOW AND HIGH FREQUENCIES OF FILTER RANGE (HZ) 

These two parameters assign the low and high frequencies of the range of group 
filtering to be performed. 

11. NUMBER OF FREQUENCY BANDS 

The given program version makes it possible to perform the group filtering of the 
input traces simultaneously for several (at most 5) frequency bands inside the assigned frequency 
range. This parameter defines the number of bands. 

12. MARGINS OF BANDS (HZ) 

The parameter is valid only if the NUMBER OF FREQUENCY BANDS is more 
than 1; it defines the inner margins of the frequency bands for filtering. It is assumed in 
the program that the frequency bands do not overlap, and that the low frequency of the first band 
and the high frequency of the last one are equal, respectively, to the low and high 
frequencies of the filter range (the values of the above parameter). Thus the “margins 
of the bands” are the points of partition of the previously assigned filter frequency 
range. 
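
The partition of the filter range into bands can be sketched as follows. The sketch assumes that a run with n bands uses the first n - 1 listed margins and ignores the rest (as the example file, with one band and four margins, suggests); this rule and the function name are assumptions.

```python
def band_edges(low, high, n_bands, margins=()):
    """Return a list of (low, high) pairs for n_bands adjacent, non-overlapping bands."""
    inner = [float(m) for m in margins][: n_bands - 1]
    if len(inner) != n_bands - 1:
        raise ValueError("need n_bands - 1 inner margins")
    edges = [float(low)] + inner + [float(high)]
    if any(b <= a for a, b in zip(edges, edges[1:])):
        raise ValueError("margins must increase strictly inside the filter range")
    return list(zip(edges[:-1], edges[1:]))

one_band = band_edges(0.0, 5.0, 1, [1.0, 2.0, 3.0, 4.0])
five_bands = band_edges(0.0, 5.0, 5, [1.0, 2.0, 3.0, 4.0])
```

For the range 0 to 5 Hz with margins 1, 2, 3 and 4 Hz, five bands of 1 Hz width are obtained; with a single band the margins play no role.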

13. NOISE MATRIX TYPE: IDENTICAL = -1; FOR REJECTION FILTER = 0; 
ADAPTIVE = 1 

To design any type of optimal group filter it is necessary to evaluate the inverse matrix 
power spectral density (IMPSD) $F^{-1}(f)$ of the array noise. The following options for the 
IMPSD can be used in the program, specified by the parameter values: 

(-1) means the identity IMPSD, corresponding to the assumption that the noise field at 
the recording site is spatially uncorrelated and white. The optimal group filtering 
coincides in this case with the conventional 3-component beamforming procedure. 

(0) means the IMPSD corresponding to a coherent noise field generated by a 
random plane wave arriving at the recording site from an assigned direction. The 
IMPSD is calculated in the program. In this case the group filtering procedure 
implements the algorithm of spatial rejecting filtering. 

(1) means the adaptive IMPSD. If the noise field contains strong coherent components of 
unknown origin, neither of the above assumptions about the noise IMPSD is 
satisfactory. In this most common case the optimal group filtering has to involve an 
estimate of the array noise IMPSD. The latter should preferably be made using the 
array noise recordings in a time interval just before the onset of the seismic event signal. The 
IMPSD estimate can be supplied by the program “armafs”. When using the 
parameter value 1, that program has to be executed before running the program 
“grfltfs”. The same array channel ordering must be guaranteed in the input data of both 
programs. 

14. NOISE WAVE DIRECTION: AZ. & INC. ANGLE; VELOCITY & WAVE 
TYPE (FOR REJECTION FILTERING) 

If rejection filtering (NOISE MATRIX TYPE = 0) is chosen as the optimal group 
filtering option, this parameter specifies the properties of the interfering plane wave: 
its arrival azimuth, incidence angle, phase velocity in the medium and polarization 
type. 

15. NAME OF FILE WITH ARRAY SEISMOMETER COORDINATES (IN KM) 
The name of the file with the coordinates of the array receivers (1C, 2C, 3C 
seismometers or SIMA). If the number of seismometers is equal to 1, its site is 
assumed to have coordinates (0,0). The file must consist of two ASCII columns: the 
first with X-coordinates (West-East orientation) and the second with Y-coordinates 
(South-North orientation). In this version of the program the Z coordinates of the 
seismometers are assumed to be equal to 0. 
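As an illustration of the coordinate file layout described above, the following Python sketch parses two whitespace-separated ASCII columns. The function name `parse_coordinates` and the sample values are hypothetical; the separator used by the actual program is not stated in the text, so whitespace is an assumption.

```python
def parse_coordinates(text):
    """Parse two ASCII columns: X (West-East) and Y (South-North), both in km."""
    coords = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        x, y = float(fields[0]), float(fields[1])
        coords.append((x, y, 0.0))  # Z is assumed to be 0 in this program version
    return coords


# Illustrative three-sensor array (made-up coordinates).
sample = "0.0 0.0\n1.5 -0.7\n-2.0 3.1\n"
coords = parse_coordinates(sample)
```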

16. NAME OF FILE FOR INVERSE MATR. SP. PARAMETERS 

This is the name of file containing parameters of the IMPSD estimate (if NOISE 
MATRIX TYPE = 1). 

17. NAME OF FILE FOR INVERSE MATR. SP. VALUES 

This is the name of file containing values of the IMPSD estimate (if NOISE 
MATRIX TYPE = 1). 

18. READ/WRITE MODE: FROM/TO STACK = 1; FROM/TO DISK FILE = -1 
The first of these two parameters defines the device from which the program input 
data are read, and the second one the device to which the program results are saved: 
(1) is for the SNDA stack, (-1) is for a disk file. 

19. DATA CHANNELS TO BE PROCESSED (IF READ FROM STACK) 

This parameter is valid if the input data are read from the SNDA stack. It must be 
a string in the CHANNELS parameter format used in the SNDA stack commands, 
for example: 15; (1,3-10,14); all. The total number of specified data channels must 
not exceed 50. 
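The list form of the channel specification can be expanded as in the hedged Python sketch below. The parser `expand_channels` is hypothetical and is not the SNDA implementation. It handles only the parenthesized list form and the keyword `all`; the meaning of a bare number such as `15` is not defined in this description, so the sketch deliberately refuses to interpret it.

```python
MAX_CHANNELS = 50  # limit stated in the parameter description


def expand_channels(spec, n_available):
    """Expand a specification such as '(1,3-10,14)' or 'all' into channel numbers."""
    spec = spec.strip()
    if spec.lower() == "all":
        channels = list(range(1, n_available + 1))
    elif spec.startswith("(") and spec.endswith(")"):
        channels = []
        for item in spec[1:-1].split(","):
            if "-" in item:
                first, last = (int(v) for v in item.split("-"))
                channels.extend(range(first, last + 1))  # inclusive range
            else:
                channels.append(int(item))
    else:
        # Assumption: the bare-number form is left uninterpreted in this sketch.
        raise ValueError("unsupported channel specification in this sketch")
    if len(channels) > MAX_CHANNELS:
        raise ValueError("more than 50 channels specified")
    return channels


selected = expand_channels("(1,3-10,14)", n_available=25)
```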

20. OUTPUT PRESENTATION: AS TIME SERIES = 0; TRACE POWERS = 1 
The filtering results can be presented in two forms: 

(0) - in the form of output time series. In this case the user gets a set of time series 
(traces) after the program execution. 

(1) - in the form of tables of filtered trace powers. A table is produced for every 
frequency band. It is composed of the powers of the group filter output traces 
corresponding to all scanning directions. 

21. NAME OF FILE WITH DATA PARAMETERS (IF READ FROM DISK FILE) 
If parameter READ MODE = -1, the name of the file containing the parameters of the 
input data has to be assigned here. The parameters are: the number of data channels, 
the number of samples in every channel and the data sampling interval; the file must 
have the form of an ASCII string. 

22. NAME OF FILE WITH DATA SAMPLES (IF READ FROM DISK FILE) 

If parameter READ MODE = -1, the name of the file containing the samples of the 
data to be processed must be assigned here. The data must be in the ASCII multiplex 
format without any header. 

23. NAME OF FILE FOR SAVING OUTPUT (IF WRITE MODE = -1) 

If WRITE MODE = -1, the parameter assigns the name of the file for saving the 
filtering results to the disk. The output traces are saved in the ASCII demultiplex 
form. Every output trace has a header containing the filtering direction (azimuth and 
incidence angle), the frequency band and the number of trace points. 

Input data format 

The input data for the program must be a multidimensional time series with a 
total number of channels of no more than 50 and no more than 4096 samples in every 
channel. If the data are stored in a disk file, the file must have the ASCII multiplex 
format. The channel ordering for every array recording instrument must in general 
be the following: S-N strainmeter, W-E strainmeter, N-S seismometer, W-E 
seismometer, Z-seismometer. For a particular array configuration some of these 
channels can be omitted. 
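The text does not spell out the ASCII multiplex layout. The Python sketch below assumes the usual meaning of "multiplexed" data: the values are ordered sample by sample, with all channels of one time sample adjacent. The function name `demultiplex` and the sample values are hypothetical.

```python
def demultiplex(text, n_channels):
    """Turn a multiplexed ASCII stream into one list of samples per channel."""
    values = [float(token) for token in text.split()]
    if len(values) % n_channels:
        raise ValueError("value count is not a multiple of the channel count")
    # Slice k::n_channels picks every n_channels-th value, starting at channel k.
    return [values[k::n_channels] for k in range(n_channels)]


# Three channels, three time samples (made-up numbers).
traces = demultiplex("1 10 100\n2 20 200\n3 30 300\n", n_channels=3)
```

Under this assumption the number of channels must be known before the file is read, which is consistent with the separate data parameter file.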

Output data format 

Depending on the value of the parameter OUTPUT PRESENTATION, the program 
output can have the following forms: 

(a) A set of time series which are the results of group filtering of the input data 
for the frequency bands specified by the parameter MARGINS OF BANDS and for 
the signal arrival directions specified by the scanning parameters. For every direction 
there is a sequence of traces corresponding to the different bands. If the output data 
are saved to a disk file, they are written in the ASCII multiplex format with a header 
containing the filtering direction (azimuth and incidence angle), the frequency band 
and the number of trace points. If the output data are saved to the SNDA stack, they 
are placed at the end of the stack and are followed by all needed information. 

(b) A set of tables containing only the averaged powers of the output traces. For 
every frequency band the table is composed of the powers of the group filter outputs 
corresponding to all scanning directions. The minimum and maximum values of the 
tables are calculated, and the arrival directions at which these extreme values are 
attained are indicated. In this version of the program these tables are only displayed 
on the screen and are not saved to a file. 

6.6. Program “GRFLTFCS” 

Extraction of waveforms of differently polarized seismic phases using 3-component 
array data with the help of optimal Wiener group filtering 


The program is designed for extracting the waveforms of the spatial oscillation 
components of a seismic wave field registered by a 3-component array. The 
oscillations in the three conventionally examined directions (‘longitudinal’ - along the 
wave propagation ray, ‘transversal’ - in the horizontal direction orthogonal to the ray, 
and ‘orthogonal’ - in the vertical plane, orthogonal to the ray) are extracted from the 
background of seismic noise, with adaptive suppression of the coherent noise 
component by means of multichannel Wiener filtering. The array data processing 
is performed in the frequency domain. 

Let the coordinate system have the X-axis directed to the East, the Y-axis to the 
North and the Z-axis to the zenith, and let $p = (p_x, p_y)^T$, $p_x = \sin\alpha \sin\beta / V$, 
$p_y = \cos\alpha \sin\beta / V$, denote the vector of apparent slownesses (inverse apparent 
velocities) of the wave in the X and Y directions; here $\alpha$ is the back azimuth of the 
wave arrival, $\beta$ is its incidence angle and $V$ is the wave phase velocity in the medium 
just beneath the array. Let the registered wave field be a wavetrain composed of a 
superposition of seismic phases with different polarizations. If all phases are 
generated by the same seismic source and the arrival direction of every phase can be 
characterized by a single vector $p$ (the ray propagation approximation), then the 
multichannel signal registered by the array of 3-component seismometers can be 
expressed in the frequency domain by the equation: 


$$u(f) = \varphi(f) \otimes B(f)\, s(f), \qquad \varphi(f) = (\varphi_1(f), \ldots, \varphi_M(f))^T, \qquad \varphi_j(f) = \exp(-i 2\pi f \tau_j),$$

where $s(f) = (s_p(f), s_t(f), s_o(f))^T$; $s_p(f)$, $s_t(f)$, $s_o(f)$ are the Fourier transforms of 
the longitudinal, transversal and orthogonal components of the wave oscillations; 
$\tau_j(p)$ is the wave time delay at the site of the $j$-th seismometer (it depends on the 
vector $p$ of the phase); $\otimes$ is the Kronecker product of the vector $\varphi$ and the $(3 \times 3)$ 
matrix $B$; $M$ is the number of array seismometers; the matrix $B(f)$ has the form 
$B(f) = A_1 A_2$. The matrix $A_1$ is equal to 

$$A_1 = \begin{pmatrix} -\sin\alpha & \cos\alpha & 0 \\ -\cos\alpha & -\sin\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

It realizes the rotation from the coordinate system associated with the seismic source 
(in which the X-axis is directed along the day surface to the source epicenter, the 
Z-axis is directed to the zenith and the Y-axis is orthogonal to the X and Z axes) to the 
geographical coordinate system introduced above. 
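The quantities introduced so far can be computed with a few lines of Python. This is a sketch under stated assumptions, not the program's code: the function names are hypothetical, and the sign convention for the delay, $\tau_j = -(p_x x_j + p_y y_j)$, is an assumption (the text only says that $\tau_j$ depends on $p$). It is chosen so that sensors lying towards the source record the wave earlier.

```python
import cmath
import math


def slowness_vector(back_azimuth_deg, incidence_deg, velocity):
    """p = (p_x, p_y) with p_x = sin(a) sin(b) / V and p_y = cos(a) sin(b) / V."""
    a = math.radians(back_azimuth_deg)
    b = math.radians(incidence_deg)
    return (math.sin(a) * math.sin(b) / velocity,
            math.cos(a) * math.sin(b) / velocity)


def phase_factors(freq, p, coords):
    """phi_j(f) = exp(-i 2 pi f tau_j) for sensors at coords [(x_j, y_j), ...] in km."""
    factors = []
    for x, y in coords:
        tau = -(p[0] * x + p[1] * y)  # ASSUMED sign convention for the time delay
        factors.append(cmath.exp(-2j * math.pi * freq * tau))
    return factors


def rotation_a1(back_azimuth_deg):
    """Rotation matrix A_1 as written in the text."""
    a = math.radians(back_azimuth_deg)
    return [[-math.sin(a), math.cos(a), 0.0],
            [-math.cos(a), -math.sin(a), 0.0],
            [0.0, 0.0, 1.0]]


# Wave arriving from the East (back azimuth 90 deg), incidence 30 deg, V = 6 km/s.
p = slowness_vector(90.0, 30.0, 6.0)
phi = phase_factors(2.0, p, [(0.0, 0.0), (1.0, 0.0)])
a1 = rotation_a1(0.0)
```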

The matrix $A_2$ reflects our model of wave propagation in the medium in the 
vicinity of the array. For the simplest model, which does not account for wave 
reflection and transformation (conversion) at the day surface, this matrix has the form 

$$A_2 = \begin{pmatrix} \sin\beta_P & 0 & -\cos\beta_V \\ 0 & 1 & 0 \\ \cos\beta_P & 0 & \sin\beta_V \end{pmatrix},$$

where $\beta_P$, $\beta_V$ are the incidence angles for the P and S waves, respectively. 
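A minimal sketch of how the $(3M \times 3)$ matrix $\varphi \otimes B$ with $B = A_1 A_2$ can be assembled for the simplest medium model is given below. The helper names are hypothetical, the matrix $A_1$ is written out for a back azimuth of 0 degrees, and the phase factors are arbitrary illustrative values.

```python
import math


def a2_simple(beta_p_deg, beta_v_deg):
    """A_2 for the simplest model (no reflection or conversion at the surface)."""
    bp, bv = math.radians(beta_p_deg), math.radians(beta_v_deg)
    return [[math.sin(bp), 0.0, -math.cos(bv)],
            [0.0, 1.0, 0.0],
            [math.cos(bp), 0.0, math.sin(bv)]]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron_vector_matrix(phi, b):
    """Kronecker product of an M-vector with a 3x3 matrix: a (3M x 3) matrix."""
    return [[phi_j * value for value in row] for phi_j in phi for row in b]


a1 = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # A_1 for back azimuth 0
a2 = a2_simple(90.0, 90.0)      # surface-wave case: both incidence angles are pi/2
b = matmul(a1, a2)
h = kron_vector_matrix([1.0, 1j], b)  # M = 2 sensors with illustrative phase factors
```

Each sensor contributes one $3 \times 3$ block, namely $B$ multiplied by that sensor's phase factor.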

The model above is too simple to reflect all features of the wave field registered 
by a 3-component array, so it is mainly relevant for the investigation of surface waves 
(for which $\beta_P = \beta_V = \pi/2$) and for preliminary (exploratory) array data analysis, 
because it does not consume much computational time. For comprehensive body wave 
investigations it is expedient to use the more complex but more realistic frequency 
independent propagation model proposed by D.Kennet, which reflects the impact of 
the day surface on body wave propagation. For this model the matrix $A_2$ has the form: 

$$A_2 = \begin{pmatrix} V_P\, p_h\, C_2 & 0 & -V_S\, q_S\, C_1 \\ 0 & 2 & 0 \\ V_P\, q_P\, C_1 & 0 & V_S\, p_h\, C_2 \end{pmatrix},$$

where $V_P$, $V_S$ are the phase velocities of the P and S waves, respectively, $p_h$ is the 
horizontal slowness of the wave, and 

$$q_P = (V_P^{-2} - p_h^2)^{1/2}, \qquad q_S = (V_S^{-2} - p_h^2)^{1/2},$$

$$C_1 = \frac{2 V_S^{-2}\,(V_S^{-2} - 2 p_h^2)}{(V_S^{-2} - 2 p_h^2)^2 + 4\, p_h^2\, q_P\, q_S}, \qquad C_2 = \frac{4 V_S^{-2}\, q_P\, q_S}{(V_S^{-2} - 2 p_h^2)^2 + 4\, p_h^2\, q_P\, q_S}.$$
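The free-surface matrix can be evaluated numerically as in the sketch below. It assumes the standard free-surface form of the coefficients, $q_P = (V_P^{-2} - p_h^2)^{1/2}$, $q_S = (V_S^{-2} - p_h^2)^{1/2}$, with $C_1$ and $C_2$ sharing the denominator $(V_S^{-2} - 2p_h^2)^2 + 4 p_h^2 q_P q_S$; the function name `a2_kennett` is hypothetical. Complex square roots are used so that the expression stays defined when a vertical slowness becomes imaginary.

```python
import cmath


def a2_kennett(v_p, v_s, p_h):
    """Free-surface matrix A_2 for horizontal slowness p_h (sec/km)."""
    q_p = cmath.sqrt(v_p ** -2 - p_h ** 2)   # vertical slowness of P
    q_s = cmath.sqrt(v_s ** -2 - p_h ** 2)   # vertical slowness of S
    denom = (v_s ** -2 - 2 * p_h ** 2) ** 2 + 4 * p_h ** 2 * q_p * q_s
    c1 = 2 * v_s ** -2 * (v_s ** -2 - 2 * p_h ** 2) / denom
    c2 = 4 * v_s ** -2 * q_p * q_s / denom
    return [[v_p * p_h * c2, 0.0, -v_s * q_s * c1],
            [0.0, 2.0, 0.0],
            [v_p * q_p * c1, 0.0, v_s * p_h * c2]]


# Vertical incidence (p_h = 0) with V_P = 6 km/s and V_S = 4 km/s.
a2 = a2_kennett(6.0, 4.0, 0.0)
```

At vertical incidence the matrix reduces to the familiar doubling of amplitudes at a free surface: the vertical response to P and the horizontal response to SV both have magnitude 2, and the P-to-horizontal term vanishes.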


In practical applications the seismic event signals are registered by an array 
against a background of additive seismic noise. So the 3-component array recordings 
can be expressed by the following equation: $u(f) = \varphi(f) \otimes B(f)\, s(f) + \xi(f)$, where 
$\xi(f)$ is the Fourier transform of the multichannel array noise component. 

The group filter frequency response that provides optimal suppression of the 
noise component and undistorted extraction of the wave oscillations in the 
longitudinal, transversal and orthogonal directions described above is given by the 
following equation: 

$$\Phi_1(f) = F^{-1}(f) H(f) \left( H^*(f) F^{-1}(f) H(f) \right)^{-1},$$

where $\Phi_1(f)$ is a $(3M \times 3)$ matrix function; $F^{-1}(f)$ is the $(3M \times 3M)$ inverse matrix 
power spectral density (IMPSD) of the array noise recordings; $H(f) = \varphi(f) \otimes B(f)$ is a 
$(3M \times 3)$ matrix. The filter transforms the $3M$-channel array seismogram into 3 output 
traces; each trace is composed of the extracted wavetrain oscillations in one of the 
three conventional directions. The single restriction used for the filter design is the 
condition that the oscillation components of a signal arriving from the given direction 
be reproduced without distortion. It is expressed by the equation $\Phi_1^*(f) H(f) = I$, 
where $I$ is the $(3 \times 3)$ identity matrix. 

One can easily check that the noise power spectral densities of the filter output 
traces are given by the equation 

$$\sigma^2(f) = (\sigma_p^2(f), \sigma_t^2(f), \sigma_o^2(f))^T = \mathrm{diag}\left[ \left( H^*(f) F^{-1}(f) H(f) \right)^{-1} \right],$$

where $\mathrm{diag}[\cdot]$ denotes the vector of diagonal elements. Thus the group filter with the 
frequency response $\Phi_3(f) = F^{-1}(f) H(f) \left( H^*(f) F^{-1}(f) H(f) \right)^{-1/2}$ produces white noise 
at every direction output. We call such a filter the whitening optimal group filter. 
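Both filter types can be checked numerically at a single frequency. The NumPy sketch below uses a synthetic Hermitian positive-definite noise MPSD and a random matrix $H$; the function names and the test data are hypothetical, and the inverse square root is obtained from an eigendecomposition, which is one possible choice rather than the program's documented method.

```python
import numpy as np


def undistorting_filter(f_inv, h):
    """Phi_1 = F^-1 H (H* F^-1 H)^-1 at one frequency."""
    g = h.conj().T @ f_inv @ h
    return f_inv @ h @ np.linalg.inv(g)


def whitening_filter(f_inv, h):
    """Phi_3 = F^-1 H (H* F^-1 H)^(-1/2) at one frequency."""
    g = h.conj().T @ f_inv @ h
    w, v = np.linalg.eigh(g)                    # g is Hermitian positive definite
    g_inv_sqrt = (v / np.sqrt(w)) @ v.conj().T  # V diag(w^-1/2) V*
    return f_inv @ h @ g_inv_sqrt


rng = np.random.default_rng(0)
n = 9  # 3M channels for M = 3 seismometers
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
f = a @ a.conj().T + n * np.eye(n)   # synthetic noise MPSD (not real data)
h = rng.standard_normal((n, 3)) + 1j * rng.standard_normal((n, 3))
f_inv = np.linalg.inv(f)

phi1 = undistorting_filter(f_inv, h)
phi3 = whitening_filter(f_inv, h)
noise_out_1 = phi1.conj().T @ f @ phi1   # equals (H* F^-1 H)^-1
noise_out_3 = phi3.conj().T @ f @ phi3   # equals the 3x3 identity
```

The first filter passes the three signal components unchanged, while the second leaves unit-power, mutually uncorrelated noise on the three outputs.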

If the seismic signal is recorded against a background of transient interference 
waves generated by a spatially localized source with known parameters, then the 
inverse noise MPSD can be found analytically: $F^{-1}(f) = I - Q (Q^* Q)^{-1} Q^*$, where $I$ is 
the $(3M \times 3M)$ identity matrix and $Q(f)$ is the $(3M \times 3)$ matrix frequency response of 
the propagation paths of the interfering wave from the noise source to the array 
sensors. For a plane interfering wave the matrix $Q(f)$ has the same structure as the 
matrix $H(f)$. The undistorting optimal group filter designed for this form of the 
MPSD is known as the spatially rejecting filter. 
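The analytic expression above is an orthogonal projector onto the complement of the columns of $Q$, which is why the interfering wave is rejected completely. A short NumPy sketch with a random, purely illustrative $Q$ makes this visible; the function name `rejection_impsd` is hypothetical.

```python
import numpy as np


def rejection_impsd(q):
    """F^-1 = I - Q (Q* Q)^-1 Q* for a (3M x 3) matrix Q."""
    n = q.shape[0]
    qh = q.conj().T
    # solve() applies (Q* Q)^-1 to Q* without forming the inverse explicitly.
    return np.eye(n) - q @ np.linalg.solve(qh @ q, qh)


rng = np.random.default_rng(1)
q = rng.standard_normal((9, 3)) + 1j * rng.standard_normal((9, 3))  # illustrative Q
f_inv = rejection_impsd(q)
```

Any signal component lying in the column space of $Q$ is mapped to zero, and applying the matrix twice changes nothing.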

Input parameters of the program 

All input parameters of the program must be contained in the file 
“grfltfcs.inp”. An example of the file is given below: 

***FILE OF INPUT PARAMETERS FOR PROGRAM "GRFLTFCS" : standard*** 

ARRAY TYPE: 3C = 1; 2C HORIZONTAL = 2; SIMA = 3 
1 

FILTER TYPE: UNDISTORTING (1) = 1; WHITENING = 3 
1 

INITIAL AND FINAL POINTS FOR SCANNING AT W-E SLOWNESS (SEC/KM) 
-0.4 0.4 

INITIAL AND FINAL POINTS FOR SCANNING AT S-N SLOWNESS (SEC/KM) 
-0.4 0.4 

INCREMENTS FOR SCANNING AT DIRECTIONS (SEC/KM) 
0.02 0.02 

MEDIUM MODEL: SIMPLEST (WITHOUT INTERACTIONS) = -1; FREQUENCY INDEPENDENT 
(D.KENNET) = 1 
-1 

P & S-WAVES MEDIUM PHASE VELOCITIES (KM/SEC) 
6. 4. 

NAME OF FILE WITH MEDIUM PHASE VELOCITY DISPERSION CURVE (FOR LOVE & 
RAYLEIGH WAVES) 

LOW AND HIGH FREQUENCIES FOR FILTERING RANGE (HZ) 
0. 5. 

NUMBER OF FREQUENCY BANDS 
3 

MARGINS OF BANDS (HZ) 
1. 1.5 3. 4. 

NOISE MATRIX TYPE: IDENTICAL = -1; FOR REJECTION FILTER = 0; ADAPTIVE = 1 
1 

NOISE DIRECTION (E-W & N-S SLOWNESS) AND MEDIUM MODEL FOR REJECTION 
FILTERING 
0.4 0.4 -1 

NAME OF FILE WITH ARRAY SEISMOMETER COORDINATES (IN KM) 
alibek.crd 

NAME OF FILE FOR INVERSE MATR. SP. PARAMETERS 
insp.par 

NAME OF FILE FOR INVERSE MATR. SP. VALUES 
insp.mtrs 

OUTPUT PRESENTATION: AS FILTERED TIME SERIES = -1; AS POWER MAPS = 1 
-1 

READ/WRITE MODE: FROM/TO STACK = 1; FROM/TO DISK FILE = -1 
1 1 

DATA CHANNELS TO BE PROCESSED (IF READ FROM SYSTEM STACK) 
36 

NAME OF FILE WITH DATA PARAMETERS (IF READ FROM DISK FILE) 
grfltfs.par 

NAME OF FILE WITH DATA SAMPLES (IF READ FROM DISK FILE) 
grfltfs.dat 

NAME OF FILE FOR SAVING OUTPUT (IF WRITE TO DISC FILE) 
grfltfs.out 


Explanation of parameters 

1. ARRAY TYPE: 3C = 1; 2C HORIZONTAL = 2; SIMA = 3 

The parameter defines the type of recording installation and can have the values: 

(1) - for a 3-component array or a single 3-component station (if the parameter 
NUMBER OF SEISMOMETERS equals 1); 

(2) - for a 2-component horizontal array; 

(3) - for a strain-seismometer microarray (SIMA), consisting of a single 3C 
seismometer and two horizontal strainmeters located at the same site. 

2. FILTER TYPE: UNDISTORTING(1) = 1; WHITENING = 3 
The parameter defines the type of group filter to be used: 

(1) - for the undistorting filter with a single restriction (frequency response (FR) $\Phi_1$); 

(3) - for the whitening group filter (FR $\Phi_3$). 

3. INITIAL AND FINAL POINTS FOR SCANNING AT W-E SLOWNESS 
(SEC/KM) 

This program version can perform the optimal group filtering of 3-component array 
data not only for a single arrival direction but also for a “fan” of directions inside 
given ranges of the W-E and S-N slowness of the wave arrival (i.e. it scans the 
medium with the assigned slowness increments). The parameter assigns the initial and 
final values of the W-E horizontal slowness of the wave arrival to be used while 
scanning the medium. If the initial value is equal to the final one, the group filtering is 
performed for a single W-E slowness value. 

4. INITIAL AND FINAL POINTS FOR SCANNING AT N-S SLOWNESS 
(SEC/KM) 

The parameter assigns the initial and final values of wave arrival S-N horizontal 
slowness to be used while scanning the medium. If the initial value is equal to the 
final one then the group filtering is performed for the single S-N slowness value. 

5. INCREMENTS FOR SCANNING AT DIRECTIONS (SEC/KM) 

These two parameters define the increments for scanning at W-E and S-N slowness 
of wave arrival. 
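The scanning grid defined by parameters 3-5 can be sketched as follows. The function name `scan_points` is hypothetical, and the sketch assumes that the final point is included in the scan when the range is a whole number of increments; the text does not state how the program treats the end point.

```python
def scan_points(initial, final, increment):
    """Slowness values from initial to final (inclusive) with the given step."""
    if initial == final:
        return [initial]  # single slowness value, no scanning
    n_steps = round((final - initial) / increment)
    # Rounding removes floating-point residue such as 0.30000000000000004.
    return [round(initial + k * increment, 10) for k in range(n_steps + 1)]


# Values from the example input file: -0.4 ... 0.4 sec/km in steps of 0.02 sec/km.
px = scan_points(-0.4, 0.4, 0.02)
py = scan_points(-0.4, 0.4, 0.02)
grid = [(x, y) for y in py for x in px]
```

With these values the scan covers 41 x 41 = 1681 slowness points, and the group filter is evaluated once per point and per frequency band.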

6. MEDIUM MODEL: SIMPLEST (WITHOUT WAVE TRANSFORM) = -1; 
FREQUENCY INDEPENDENT (D.KENNET) = 1 

The parameter defines the medium model used: 

(-1) corresponds to the simplest model, which does not account for the impact of the 
day surface on the wave propagation; 

(1) corresponds to the frequency independent model proposed by D.Kennet, which 
accounts for the wave reflections and transformations at the day surface. 

7. P & S-WAVES MEDIUM PHASE VELOCITIES (KM/SEC) 

These two parameters determine the phase velocities of the P and S waves in the 
medium beneath the array. If the Rayleigh wave is studied, both parameters must have 
the same value, equal to the Rayleigh wave phase velocity: $V_P = V_S = V_R$. 

8. NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR 
SURFACE WAVE (IF NAME=' ' - WITHOUT DISPERSION) 

This is the name of the file containing the surface wave dispersion curve (the phase 
velocity as a function of frequency). If the parameter value contains blanks in the first 
5 positions, the dispersion is absent and the surface wave is regarded as having a 
frequency independent (mean) phase velocity. The file has to consist of two ASCII 
format columns: the frequency in Hz and the velocity in km/sec. 
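A dispersion curve file of this form could be read and evaluated as in the sketch below. The function names and the curve values are hypothetical, and the use of linear interpolation between tabulated points (with constant values outside the table) is an assumption; the text does not state how the program interpolates.

```python
from bisect import bisect_right


def parse_dispersion(text):
    """Parse two ASCII columns: frequency (Hz) and phase velocity (km/sec)."""
    rows = [tuple(float(v) for v in line.split()[:2])
            for line in text.splitlines() if line.split()]
    return sorted(rows)


def phase_velocity(curve, freq, mean_velocity=None):
    """Phase velocity at freq; falls back to mean_velocity when no curve is given."""
    if not curve:
        return mean_velocity  # blank file name: frequency independent velocity
    freqs = [f for f, _ in curve]
    if freq <= freqs[0]:
        return curve[0][1]
    if freq >= freqs[-1]:
        return curve[-1][1]
    i = bisect_right(freqs, freq)
    (f0, v0), (f1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (freq - f0) / (f1 - f0)  # linear interpolation


# Made-up curve with three tabulated points.
curve = parse_dispersion("0.05 3.9\n0.10 3.5\n0.20 3.2\n")
v = phase_velocity(curve, 0.15)
```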

9. LOW AND HIGH FREQUENCIES OF FILTER RANGE (HZ) 

These two parameters assign the low and high frequencies of the range of group 
filtering to be performed. 

10. NUMBER OF FREQUENCY BANDS 

This program version can compute the group-filtered output traces simultaneously for 
several (at most 5) frequency bands inside the assigned frequency range. The 
parameter defines the number of bands. 

11. MARGINS OF BANDS (HZ) 

The parameter is valid only if the NUMBER OF FREQUENCY BANDS is greater 
than 1; it defines the inner margins of the frequency bands used for filtering. The 
program assumes that the bands do not overlap and that the low frequency of the 
first band and the high frequency of the last band equal, respectively, the low and 
high frequencies of the filter range (the values of the filter range parameter above). 
Thus the “margins of the bands” are the points that partition the previously assigned 
filter frequency range. 

12. NOISE MATRIX TYPE: IDENTICAL = -1; FOR REJECTION FILTER = 0; 
ADAPTIVE = 1 

The design of any type of optimal group filter requires the inverse matrix power 
spectral density (IMPSD) of the array noise. The following IMPSD options are 
available in the program and are selected by the parameter value: 

(-1) means the identity IMPSD, corresponding to the assumption that the noise field 
at the recording site is spatially uncorrelated and white. In this case the optimal group 
filtering coincides with the conventional 3-component beamforming procedure. 

(0) means the IMPSD corresponding to a coherent noise field generated by a random 
plane wave arriving at the recording site from an assigned direction. The IMPSD is 
calculated in the program. In this case the optimal group filtering procedure 
implements the algorithm of spatial rejection filtering. 

(1) means the adaptive IMPSD. If the noise field contains strong coherent 
components of unknown origin, neither of the two assumptions above about the noise 
IMPSD is satisfactory. In this most common case the optimal group filtering has to 
involve an estimate of the array noise IMPSD. The estimate should preferably be made 
from the array noise recordings in a time interval just before the onset of the seismic 
event signal. The IMPSD estimate can be supplied by the program “armafs”. When 
the parameter value 1 is used, that program has to be executed before running the 
program “grfltfs”. The same array channel ordering must be guaranteed in the input 
data of both programs. 

13. NOISE WAVE DIRECTION: AZ. & INC. ANGLE; VELOCITY & WAVE 
TYPE (FOR REJECTION FILTERING) 

If rejection filtering (NOISE MATRIX TYPE = 0) is chosen as the optimal group 
filtering option, this parameter specifies the properties of the interfering plane wave: 
its arrival azimuth, incidence angle, phase velocity in the medium and polarization 
type. 

14. NAME OF FILE WITH ARRAY SEISMOMETER COORDINATES (IN KM) 
The name of the file with the coordinates of the recording installations composing the 
array (1C, 2C, 3C seismometers or SIMA). If the number of seismometers is equal to 
1, its site is assumed to have coordinates (0,0). The file must consist of two ASCII 
columns: the first with X-coordinates (West-East orientation) and the second with 
Y-coordinates (South-North orientation). In this version of the program the 
Z coordinates of the seismometers are assumed to be equal to 0. 

15. NAME OF FILE FOR INVERSE MATR. SP. PARAMETERS 

This is the name of file containing parameters of the IMPSD estimate (if NOISE 
MATRIX TYPE =1). 

16. NAME OF FILE FOR INVERSE MATR. SP. VALUES 

This is the name of file containing values of the IMPSD estimate (if NOISE 
MATRIX TYPE =1). 

17. OUTPUT PRESENTATION: AS TIME SERIES = 0; TRACE POWER MAPS = 1 

The filtering results can be presented in two forms: 

(0) - in the form of output time series. In this case the user gets a set of time series 
(traces) after the program execution. 

(1) - in the form of filtered trace power maps. For every frequency band and every 
wave oscillation component a map with the coordinates X-slowness - Y-slowness is 
produced. The maps are composed of the powers of the group filter output traces 
corresponding to all scanned wave arrival directions. 

18. READ/WRITE MODE: FROM/TO STACK = 1; FROM/TO DISK FILE = -1 
The first of these two parameters defines the device from which the program input 
data are read, and the second one the device to which the program results are saved: 
(1) is for the SNDA stack, (-1) is for a disk file. 

19. DATA CHANNELS TO BE PROCESSED (IF READ FROM SNDA STACK) 
This parameter is valid if the input data are read from the SNDA stack. It must be 
a string in the CHANNELS parameter format used in the SNDA stack commands, 
for example: 15; (1,3-10,14); all. The total number of specified data channels must 
not exceed 50. 

20. NAME OF FILE WITH DATA PARAMETERS (IF READ FROM DISK FILE) 
If parameter READ MODE = -1, the name of the file containing the parameters of the 
input data has to be assigned here. The parameters are: the number of data channels, 
the number of samples in every channel and the data sampling interval; the file must 
have the form of an ASCII string. 

21. NAME OF FILE WITH DATA SAMPLES (IF READ FROM DISK FILE) 

If parameter READ MODE = -1, then the name of file containing the samples of 
data to be processed must be assigned here. The data has to be in the ASCII 
multiplex form without any header. 

22. NAME OF FILE FOR SAVING OUTPUT (IF WRITE MODE = -1) 

If WRITE MODE = -1, the parameter assigns the name of the file for saving the 
filtering results to the disk. The output traces are saved in the ASCII demultiplex 
form. Every output trace has a header containing the filtering direction (X and Y 
slowness), the frequency band and the number of trace points. 

Input data format 

The input data for the program must be a multidimensional time series with a 
total number of channels of no more than 50 and no more than 4096 samples in every 
channel. If the data are stored in a disk file, the file must have the ASCII multiplex 
format. The channel ordering for every array recording instrument must in general 
be the following: S-N strainmeter, W-E strainmeter, N-S seismometer, W-E 
seismometer, Z-seismometer. For a particular array configuration some of these 
channels can be omitted. 

Output data format 

Depending on the value of the parameter OUTPUT PRESENTATION, the program 
output can have the following forms: 

(a) A set of time series which are the results of group filtering of the input data 
for the frequency bands specified by the parameter MARGINS OF BANDS and for 
the signal arrival directions specified by the scanning parameters. For every direction 
there is a sequence of traces corresponding to the different bands. If the output data 
are saved to a disk file, they are written in the ASCII multiplex format with a header 
containing the filtering direction (X, Y slowness), the frequency band and the number 
of trace points. If the output data are saved to the SNDA stack, they are placed at the 
end of the stack and are followed by all needed information. 

(b) A set of maps in the coordinates X-slowness - Y-slowness containing the 
averaged powers of the output traces (F-K maps). Such maps (composed of the powers 
of the group filter outputs corresponding to all scanning directions) are produced for 
every component of wave oscillation and every frequency band. The minimum and 
maximum values of the maps are calculated, and the arrival directions at which these 
extreme values are attained are indicated. The maps are saved to a disk file and 
displayed on the screen with the help of the SNDA “Surfer” routine. 

6.7. Program “GRFLTFK” 

Adaptive 3-component F-K analysis 

The program is intended for adaptive estimation of the horizontal slowness vector 
of a seismic wave phase using observations from a 3-component (3C), 2-component 
(2C) or 1-component (1C) seismic array. The following model of the multidimensional 
time series at the output of the 3C array sensors is used (written in the frequency 
domain): 

$$x(f) = y(f) + \xi(f) = h_w(f, p, V)\, s_w(f) + \xi(f), \qquad (1)$$

where $x(f) = (x_1(f), \ldots, x_{3M}(f))^T$ is the 3M-dimensional vector of array observations in 
the frequency domain; $\xi(f)$ is the 3M-dimensional vector of the additive array noise 
component; $h_w(f, p, V)$ is the 3M-dimensional vector frequency response (VFR) of the 
medium beneath the array, which converts the complex spectrum $s_w(f)$ of the wave 
phase oscillations into oscillations at the outputs of the 3M array sensors. The 
equations for calculating the VFR, based on two different models of the laterally 
homogeneous medium, are given in Section 3.1 (see also the description of the 
program “grfiltfs”). Note that the type of polarization of the analyzed wave directly 
defines the form of the function $h_w(f, p, V)$, along with the value of the wave velocity 
in the medium and the array sensor coordinates. 

In the general case the array is assumed to consist of M 3-component 
seismometers with N-S, E-W and Z sensors. In the case of a Z-component array or an 
array consisting of horizontal N-S, E-W seismometers, some of the sensors are 
regarded as switched off and the corresponding elements in all vectors of eq. (1) are 
replaced by zeros. For the adaptive mode of program execution it is additionally 
assumed that an observation of a “pure” noise realization $\xi_0(f)$ is available. Usually 
this is an array noise recording from a time interval preceding or succeeding the 
interval that contains the signal wave phase being analyzed. It is supposed that the 
current seismic noise field is reasonably stationary, so that the noise $(3M \times 3M)$ matrix 
power spectral density (MPSD) $F_0(f)$, calculated from the “pure” noise realization 
$\xi_0(f)$ with the help of the program “armafs”, corresponds well to the MPSD of the 
noise realization $\xi(f)$ obscuring the signal waveforms at the array sensors. 

As shown in Section 3.3, under the assumptions above the statistically optimal 
(Maximum Likelihood) estimate of the wave phase apparent slowness vector 
$p = (p_n, p_e)$ can be calculated in accordance with the following formula: 

$$\hat{p} = \arg\max_{p} \sum_{j=j_{min}}^{j_{max}} \frac{h_w^*(f_j, p, V)\, F_0^{-1}(f_j)\, x(f_j)\, x^*(f_j)\, F_0^{-1}(f_j)\, h_w(f_j, p, V)}{h_w^*(f_j, p, V)\, F_0^{-1}(f_j)\, h_w(f_j, p, V)},$$

where $x(f_j)$ are the complex spectral components of the array observations of eq. (1) 
at the frequency points of the Discrete Fourier Transform (DFT), and $j_{min}$ and $j_{max}$ 
are the DFT points that define the frequency band of the F-K analysis. 

The program implementing the ML algorithm of apparent slowness vector 
estimation creates a map consisting of the values of the functional corresponding to 
the $p_n$, $p_e$ slowness values in the assigned range of wave arrival directions. The 
maximum value of the map corresponds to the ML estimate of the apparent slowness 
vector of the wave arrival. 
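The construction of such a map can be sketched in NumPy for the simplest possible case. The sketch rests on several assumptions that are not taken from the program: a 1-component array (so that $h_w$ reduces to plane-wave phase factors), a noise-free synthetic plane wave, an identity noise MPSD, a made-up five-sensor geometry and an assumed sign convention for the time delays. All function names are hypothetical.

```python
import numpy as np


def steering(freq, p, coords):
    """h_w for a 1C array: phase factors exp(-i 2 pi f tau_j)."""
    tau = -(coords @ np.asarray(p))   # ASSUMED sign convention for the delays
    return np.exp(-2j * np.pi * freq * tau)


def fk_functional(spectra, freqs, f0_inv, p, coords):
    """Sum over DFT frequencies of |h* F0^-1 x|^2 / (h* F0^-1 h)."""
    total = 0.0
    for x, freq, fi in zip(spectra, freqs, f0_inv):
        h = steering(freq, p, coords)
        num = np.abs(h.conj() @ fi @ x) ** 2   # h* F0^-1 x x* F0^-1 h
        den = np.real(h.conj() @ fi @ h)       # h* F0^-1 h
        total += num / den
    return total


coords = np.array([[0.0, 0.0], [1.0, 0.2], [-0.5, 0.9], [0.3, -1.1], [-0.8, -0.4]])
freqs = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
p_true = (0.06, -0.10)                                   # sec/km, illustrative
spectra = [steering(f, p_true, coords) for f in freqs]   # noise-free plane wave
f0_inv = [np.eye(len(coords)) for _ in freqs]            # identity noise MPSD

axis = np.round(np.arange(-10, 11) * 0.02, 10)           # -0.2 ... 0.2 sec/km
fk_map = np.array([[fk_functional(spectra, freqs, f0_inv, (px, py), coords)
                    for px in axis] for py in axis])
iy, ix = np.unravel_index(np.argmax(fk_map), fk_map.shape)
p_est = (axis[ix], axis[iy])
```

In the adaptive mode the identity matrices in `f0_inv` would be replaced by inverse noise MPSD estimates, one per DFT frequency.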

The program “grfiltfk” can also work in the nonadaptive mode. In this case two 
variants exist. In the first variant the noise MPSD is supposed to be the identity 
matrix: $F_0(f) = I$. This corresponds to the assumption that the array noise originates 
from a spatially uncorrelated white noise field. In this case the ML apparent slowness 
estimate coincides with the conventional 3-component wide band F-K analysis. In the 
second variant it is assumed that the array noise is purely coherent and is generated 
by a transient nuisance wave with known apparent slowness vector, polarization type 
and phase velocity in the medium. The MPSD $F_0(f)$ corresponding to this nuisance 
wave is calculated in the program from these parameters by the equations for the 
kernel matrix of the rejection group filter (see eqs. (12)-(14) in Section 3.2.4). 

The program “grfiltfk” can calculate a series of maps corresponding to 
different frequency bands. This makes it possible to analyze the wave arrival direction 
parameters as a function of frequency, for example to calculate the dispersion curves 
of teleseismic surface waves. 

Input parameters of the program 

All parameters of the program must be contained in a disk file with the name 
“grfiltfk.inp”. An example of the file is given below. 

*** FILE OF INPUT PARAMETERS FOR PROGRAM "GRFILTFK" : standard *** 

ARRAY TYPE: 3C ARRAY=1; 1C ARRAY=2; 2C HORIZONTAL ARRAY=3; SSI=4; 
2 

FILTER TYPE: UNDISTORTING(1)=1; UNDISTORTING(2)=2; WHITENING=3 
3 

WAVE TYPE: P=1; SH & LOVE=2; SV=3; RAYLEIGH=4; 
1 

IMAGE: CONTOUR MAP=0; 3D IMAGE=1; BOTH=2 
2 

PLOTTING: FROM PROGRAM=0; FROM SCRIPT=1 
0 

INICIAL AND FINAL POINTS FOR SCANNING AT E-W-SLOWNESS (SEC/KM) 
-0.2 0.2 

INICIAL AND FINAL POINTS FOR SCANNING AT N-S SLOWNESS (SEC/KM) 
-0.2 0.2 

INCREMENTS FOR SCANNING AT SLOWNESS (SEC/KM) 
0.02 0.02 

RAYLEIGH WAVE ELLIPTIC COEFFICIENT (HORIZ/VERT) 
0.8 

MEDIUM PHASE VELOCITY (KM/SEC) 
5. 

NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR SURFACE WAVE (IF 
NAME=' ' - WITHOUT DISPERSION) 

LOW AND HIGH FREQUENCIES FOR GROUP FILTERING (HZ) 
1. 3. 

NUMBER OF FREQUENCY BANDS 
1 

MARGINS OF BANDS (HZ) 
3. 4. 

CALCULATION OF AVERAGED MAP FOR DIFFERENT FREQUENCES: YES=1; NO=0 
1 

NOISE MATRIX TYPE: IDENTICAL=-1; CALC. FOR REJECTION FILTER=0; 
ADAPTIVE=1; 
1 

NOISE DIRECTION (EW & NS SLOWN.), VELOCITY AND WAVE TYPE (FOR REJECTION 
FILTER) 
0.2 0.2 3. 1 

NAME OF WITH ARRAY COORDINATE (COORDINATES IN KM) 
data/lapshin/noress.crd 

NAME OF FILE FOR INOISE INVERSE SPECTRUM PARAMETERS 
ssa/lapshin/insp.par 

NAME OF FILE FOR NOISE INVERSE MATRIX SPECTRUM 
ssa/lapshin/insp.mtrs 

READ/WRITE MODE: FROM/TO SYSTEM STACK=1; FROM/TO DISC FILE=-1 
1 1 

DATA CHANNELS TO BE PROCESSED (IF READ FROM SYSTEM STACK) 
25 

NAME OF FILE WITH DATA PARAMETERS (IF READ FROM DATA FILE) 
ssa/lapshin/data/grfltfs.par 

NAME OF FILE WITH DATA SAMPLES (IF READ FROM DATA FILE) 
ssa/lapshin/data/grfltfs.dat 

Explanation of the input parameters 

1. ARRAY TYPE: 3C ARRAY = 1; 1C ARRAY = 2; 2C HORIZ. ARRAY = 3; 
SIMA = 4 

The parameter defines the type of recording installation and can have the values: 

(1) - for a 3-component array or a single 3-component station; 

(2) - for a 1-component array consisting of similarly oriented sensors; 

(3) - for a 2-component horizontal array; 

(4) - for a strain-seismometer microarray (SIMA), consisting of a single 3C 
seismometer and two horizontal strainmeters located at the same site. 

2. FILTER TYPE: UNDISTORTING(1) = 1; UNDISTORTING(2) = 2; 
WHITENING = 3 

The parameter defines the type of group filter to be used: 

(1) - for the undistorting filter with a single restriction (frequency response (FR) $\Phi_1$); 

(2) - for the undistorting filter with restrictions on the FR spatial derivatives (FR $\Phi_2$); 

(3) - for the whitening group filter (FR $\Phi_3$). 

3. WAVE TYPE: P = 1; SH & LOVE = 2; SV = 3; RAYLEIGH =4 

The parameter defines the type of seismic wave phase to be extracted. This program 
version is intended for extracting only a single phase with a specific polarization: 

(1) - for the P-phase; 

(2) - for the SH or Love phases; 

(3) - for the SV phase; 

(4) - for the Rayleigh phase. 

4. IMAGE: CONTOUR MAP=0; 3D IMAGE=1; BOTH=2 

The parameter defines the type of program output control file used by the utility that 
plots the F-K map. The user can choose one (or both) of the two specified programs: 
the standard UNIX graphic routine “contour” or the SNDA graphic program 
“surfer”. 

5. PLOTTING: FROM PROGRAM=0; FROM SCRIPT=1 

The parameter selects whether the graphic utility for imaging the F-K map is started 
from inside the program “grfiltfk” or this function is transferred to a special SNDA 
command. 

6. INITIAL AND FINAL POINTS FOR SCANNING AT E-W SLOWNESS 
(SEC/KM) 

The parameter assigns the initial and final values of the W-E horizontal slowness of the 
wave to be used while scanning the medium to create the F-K map. 

7. INITIAL AND FINAL POINTS FOR SCANNING AT N-S SLOWNESS 
(SEC/KM) 

The parameter assigns the initial and final values of the S-N horizontal slowness of the 
wave to be used while scanning the medium to create the F-K map. 

8. INCREMENTS FOR SCANNING AT DIRECTIONS (SEC/KM) 

These two parameters define the increments for scanning over the W-E and S-N slowness 
of the wave arrival while creating the F-K map. 
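
The three scanning parameters together define a rectangular slowness grid. The following is a minimal Python sketch of how such a grid could be built; the function names and the numeric values are illustrative assumptions and are not taken from the program source.

```python
def slowness_axis(start, stop, step):
    """Return scan values from start to stop (inclusive) with the given step."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Round the step count so floating-point error does not drop the end point.
    n = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n + 1)]


def slowness_grid(ew_range, ns_range, steps):
    """Build the list of (p_ew, p_ns) scan points in sec/km."""
    ew = slowness_axis(ew_range[0], ew_range[1], steps[0])
    ns = slowness_axis(ns_range[0], ns_range[1], steps[1])
    return [(px, py) for py in ns for px in ew]


# Illustrative values: scan from -0.2 to 0.2 sec/km in steps of 0.02 sec/km.
grid = slowness_grid((-0.2, 0.2), (-0.2, 0.2), (0.02, 0.02))
print(len(grid))  # number of scan points on the 21 x 21 grid
```

Halving both increments roughly quadruples the number of scan points, and therefore the number of F-K statistic evaluations.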

9. RAYLEIGH WAVE ELLIPTIC COEFFICIENT (HORIZ/VERT) 

This is the elliptic coefficient of the Rayleigh wave: the ratio of the minor axis of the 
polarization ellipse to the major one. 

10. MEDIUM PHASE VELOCITY (KM/SEC) 

This is the phase velocity of the P or S wave in the medium just beneath the array. If, 
when filtering Rayleigh or Love waves, the value of the parameter “Name of file with 
dispersion curve” consists of 5 blanks, this velocity is used as the average surface 
wave velocity. 

11. NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR 
SURFACE WAVE (IF NAME = '     ' - WITHOUT DISPERSION) 

This is the name of the file containing the surface wave dispersion curve (the phase 
velocity as a function of frequency). If the first 5 positions of the name are blanks, 
dispersion is taken to be absent and the surface wave is regarded as having a 
frequency-independent (mean) phase velocity. The file has to consist of two 
ASCII columns: the frequency in Hz and the velocity in km/sec. 
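
As an illustration of the two-column file layout, the Python sketch below parses such a curve and returns a phase velocity for a given frequency. Linear interpolation between the tabulated points is an assumption of this sketch, since the interpolation rule used by the program is not stated here. The sample values are made up.

```python
def parse_dispersion(text):
    """Parse two ASCII columns: frequency (Hz) and phase velocity (km/sec)."""
    curve = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            curve.append((float(parts[0]), float(parts[1])))
    return sorted(curve)


def phase_velocity(curve, freq, mean_velocity):
    """Interpolate linearly; use the mean velocity when no curve is given."""
    if not curve:
        return mean_velocity          # "blank name" case: no dispersion
    if freq <= curve[0][0]:
        return curve[0][1]
    if freq >= curve[-1][0]:
        return curve[-1][1]
    for (f0, v0), (f1, v1) in zip(curve, curve[1:]):
        if f0 <= freq <= f1:
            return v0 + (v1 - v0) * (freq - f0) / (f1 - f0)


sample = "0.5 3.8\n1.0 3.6\n2.0 3.2\n"   # made-up curve for illustration
curve = parse_dispersion(sample)
v = phase_velocity(curve, 1.5, mean_velocity=3.5)
print(v)
```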

12. LOW AND HIGH FREQUENCIES OF F-K ANALYSIS RANGE (HZ) 

These two parameters assign the low and high frequencies of the range within which 
the F-K analysis is to be performed. 

13. NUMBER OF FREQUENCY BANDS 

This version of the program can perform the F-K analysis of the input traces 
simultaneously for several (maximum 5) frequency bands inside the assigned frequency 
range. This parameter defines the number of bands. 

14. MARGINS OF BANDS (HZ) 

The parameter is valid only if the NUMBER OF FREQUENCY BANDS is more 
than 1; it defines the inner margins of the frequency bands for filtering. The program 
assumes that the frequency bands do not overlap and that the low frequency of the first 
band and the high frequency of the last one are equal, respectively, to the low and high 
frequencies of the filter range (the values of parameter 12 above). Thus the “margins 
of the bands” are the partition points of the previously assigned filter frequency 
range. 
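
The partition rule can be sketched in a few lines of Python. The function name and the numeric values are hypothetical; only the rule itself (adjacent, non-overlapping bands, at most 5 of them) comes from the description above.

```python
def frequency_bands(f_low, f_high, margins):
    """Split [f_low, f_high] into adjacent bands at the inner margins (Hz)."""
    edges = [f_low] + list(margins) + [f_high]
    if any(a >= b for a, b in zip(edges, edges[1:])):
        raise ValueError("margins must increase strictly inside the filter range")
    if len(edges) - 1 > 5:
        raise ValueError("at most 5 bands are allowed")
    return list(zip(edges, edges[1:]))


# Illustrative values: a 1-4 Hz range split at 2 Hz and 3 Hz gives 3 bands.
bands = frequency_bands(1.0, 4.0, [2.0, 3.0])
print(bands)
```

With an empty list of margins the single band is the whole filter range, which corresponds to NUMBER OF FREQUENCY BANDS = 1.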

15. CALCULATION OF AVERAGED MAP FOR DIFFERENT FREQUENCIES: 
YES = 1; NO = 0 

This is the switch for the option of computing the averaged F-K map composed 
from the set of maps previously calculated for the different frequency bands. 

16. NOISE MATRIX TYPE: IDENTICAL = -1; FOR REJECTION FILTER = 0; 
ADAPTIVE = 1 

To create the adaptive F-K map, the inverse matrix power spectral density 
(IMPSD) $F_0^{-1}(f)$ of the array noise must be evaluated. The following options for the 
IMPSD can be used in the program, as specified by the parameter values: 

(-1) means the identity IMPSD, corresponding to the assumption that the noise field at 
the recording site is spatially uncorrelated and white. The adaptive F-K analysis 
coincides in this case with the conventional (3-component) wide band F-K analysis. 

(0) means the IMPSD corresponding to the coherent noise field generated by a 
nuisance plane wave arriving at the recording site from an assigned direction. The 
IMPSD is calculated in the program. 

(1) means the adaptive IMPSD. If the noise field contains strong coherent components 
of unknown origin, neither of the above assumptions about the noise IMPSD is 
satisfactory. In this most common case the optimal group filtering has to involve an 
estimate of the array noise IMPSD. The estimate should preferably be made using 
array noise recordings from a time interval just before the seismic signal onset. The 
IMPSD estimate can be supplied by the program “armafs”. When the parameter 
value = 1 is used, that program has to be executed before running the program 
“grfiltfk”. The same array channel ordering must be guaranteed for the input data 
of both programs. 

17. NOISE DIRECTION (EW & NS SLOWN.), VELOCITY AND WAVE TYPE 
(FOR REJECTION NOISE SUPPRESSION) 

If NOISE MATRIX TYPE = 0 is chosen as the option for adaptive F-K analysis, 
this parameter specifies the features of the interfering plane wave: its apparent slowness 
vector, the wave velocity in the medium and the polarization type. 

18. NAME OF FILE WITH ARRAY SEISMOMETER COORDINATES (IN KM) 
This is the name of the file with the coordinates of the array receivers (1C, 2C, 3C 
seismometers or SIMA). If the number of seismometers is equal to 1, its site is assumed to 
have coordinates (0,0). The file must consist of two ASCII columns: the first one with 
the X-coordinates (West-East orientation) and the second one with the Y-coordinates 
(South-North orientation). In this version of the program the Z coordinates of the 
seismometers are assumed to be equal to 0. 
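
To make the role of the coordinate file concrete, the Python sketch below parses the two columns and computes the arrival delay of a plane wave at each sensor for a trial slowness vector. The function names, the sign convention of the delay and the sample layout are assumptions of this sketch, not details taken from the program.

```python
def parse_coordinates(text):
    """Parse two ASCII columns: X (West-East) and Y (South-North), in km."""
    coords = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            coords.append((float(parts[0]), float(parts[1])))
    # A single seismometer is assumed to sit at the origin.
    return [(0.0, 0.0)] if len(coords) == 1 else coords


def plane_wave_delays(coords, p_ew, p_ns):
    """Delay (sec) at each sensor relative to the origin: p_ew*x + p_ns*y."""
    return [p_ew * x + p_ns * y for x, y in coords]


sample = "0.0 0.0\n1.5 0.0\n0.0 -2.0\n"      # made-up three-sensor layout
coords = parse_coordinates(sample)
delays = plane_wave_delays(coords, 0.1, 0.05)  # slowness in sec/km
print(delays)
```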

19. NAME OF FILE FOR INVERSE MATRIX SPECTRUM PARAMETERS 
This is the name of the file containing the parameters of the IMPSD estimate (if NOISE 
MATRIX TYPE = 1). 

20. NAME OF FILE FOR INVERSE MATRIX SPECTRUM VALUES 

This is the name of the file containing the values of the IMPSD estimate (if NOISE 
MATRIX TYPE = 1). 

21. READ/WRITE MODE: FROM/TO STACK = 1; FROM/TO DISK FILE = -1 
The first of these two parameters defines the device from which the program input 
data are read; the second one defines the device on which the program results are 
saved: (1) is for the SNDA stack, (-1) is for the disk file. 

22. DATA CHANNELS TO BE PROCESSED (IF READING FROM STACK) 

This parameter is valid if the input data are read from the SNDA stack. It must be 
a string in the CHANNELS parameter format used in the SNDA stack commands, 
for example: 15; (1,3-10,14); all. The total number of specified data channels 
must not exceed 50. 
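
The three example strings suggest a simple list-and-range syntax. The Python sketch below expands such a string into channel numbers. It is a hypothetical parser written only from the examples given here: the exact SNDA CHANNELS grammar is not specified in this section, and reading a bare number such as 15 as a single channel number is an assumption.

```python
def parse_channels(spec, n_available):
    """Expand '15', '(1,3-10,14)' or 'all' into a list of channel numbers."""
    spec = spec.strip().lower()
    if spec == "all":
        channels = list(range(1, n_available + 1))
    else:
        channels = []
        for item in spec.strip("()").split(","):
            if "-" in item:
                # A range such as 3-10 includes both end points.
                first, last = (int(v) for v in item.split("-"))
                channels.extend(range(first, last + 1))
            else:
                channels.append(int(item))   # assumed: a single channel number
    if len(channels) > 50:
        raise ValueError("no more than 50 channels may be specified")
    return channels


print(parse_channels("(1,3-10,14)", 25))
```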

23. NAME OF FILE WITH DATA PARAMETERS (IF READING FROM DISK 
FILE) 

If the parameter READ MODE = -1, the name of the file containing the parameters of 
the input data has to be assigned here. The parameters are: the number of data channels, 
the number of samples in every channel and the data sampling interval; the file must have 
the form of an ASCII string. 

24. NAME OF FILE WITH DATA SAMPLES (IF READING FROM DISK FILE) 
If the parameter READ MODE = -1, the name of the file containing the samples of 
the data to be processed must be assigned here. The data must be in the ASCII multiplex 
format without any header. 
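
A multiplexed file interleaves the channels sample by sample. The Python sketch below splits such a stream into per-channel traces. It assumes that the values for one time sample of all channels are written consecutively, which is the usual meaning of "multiplex" but is not spelled out here; the function name and sample data are illustrative.

```python
def demultiplex(samples_text, n_channels):
    """Split a multiplexed ASCII stream into one list of samples per channel."""
    values = [float(v) for v in samples_text.split()]
    if len(values) % n_channels:
        raise ValueError("sample count is not a multiple of the channel count")
    # Sample k of channel c is assumed to sit at position k * n_channels + c.
    return [values[c::n_channels] for c in range(n_channels)]


text = "1 10 100\n2 20 200\n3 30 300\n"   # 3 channels, 3 samples each (made up)
traces = demultiplex(text, 3)
print(traces)
```

The number of channels would come from the data parameter file described under parameter 23.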


6.8. Program SP3C 

Adaptive multimode F-K analysis of 3-component array data 

The program is intended for adaptive estimation of the slowness vectors of 
seismic waves using observations from 3-component (3C) seismic arrays. In the 
nonadaptive mode the program calculates estimates of the spatial spectra of array 
recordings using the conventional low resolution method and various high resolution 
methods. In the adaptive mode it provides low resolution estimation of the spatial 
spectrum of the “signal” wave with suppression of the spatial spectral components of 
the noise. The latter function coincides with the purpose of the program “grfiltfk”, which 
is in fact the theoretical prototype of the program “sp3c”. 

The program “sp3c” was developed because the prototype program “grfiltfk” is 
rather time consuming. As seen from eq.(1) of the program “grfiltfk” description, 
M×M matrices (where M is the number of array sensors) must be calculated and 
inverted for all Discrete Fourier Transform (DFT) frequencies of the seismic range 
being analyzed, i.e. thousands of computational cycles must be performed. The 
program “sp3c” is based on an approximate formula for calculating the multimode 
adaptive spatial spectrum. It runs tens of times faster than the prototype program 
“grfiltfk”. 

The modification of the calculating algorithm rests on the following 
assumptions. Let the frequency band being analyzed be quite narrow, so that we can neglect 
variations of the functions $h(f_j,p,V)$ and $F_0^{-1}(f_j)$ within this band. Then equation 
(1) in the description of the program “grfiltfk” can be transformed into the following form: 

$$\hat{p} = \arg\max_{p} \frac{h^{*}(f_0,p,V)\,F_0^{-1}(f_0)\left[\sum_{j=j_{\min}}^{j_{\max}} x(f_j)\,x^{*}(f_j)\right]F_0^{-1}(f_0)\,h(f_0,p,V)}{h^{*}(f_0,p,V)\,F_0^{-1}(f_0)\,h(f_0,p,V)}$$ 

where $f_0$ is the frequency in the middle of the band being analyzed, and $j_{\min}$, $j_{\max}$ 
are the indices of the lower and upper DFT frequencies in the band. The expression in 
square brackets is the averaged matrix periodogram of the 3M-dimensional vector array 
observations in the frequency domain. This averaged matrix is in fact an estimate of the 
MPSD of the array seismograms being analyzed, which contain the signal wave obscured 
by array noise components. The expression in square brackets can be replaced by another, 
more statistically grounded estimate $F_x(f_0)$ of the averaged MPSD of the signal 
observations in the given frequency band. Recall that the matrix $F_0(f_0)$ in eq.(1) is 
the MPSD of the “pure” array noise, calculated during adaptation from “pure” array 
noise recordings. In the program “sp3c” this calculation is based on the matrix 
coefficients of the noise ARMA model produced by the program “marmamo” at the 
stage of adaptation to noise. 
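
The bracketed term, the matrix periodogram accumulated over the DFT bins of the band, can be sketched in plain Python as follows. The function name and the sample spectra are illustrative assumptions. The sketch forms the plain sum over the bins; dividing by the number of bins gives the average and does not change the location of the maximum over slowness.

```python
def summed_periodogram(spectra):
    """Sum of outer products x(f_j) x*(f_j) over the DFT bins of one band.

    spectra: list of vectors (one per DFT frequency), each a list of complex
    values with one entry per array channel.
    """
    n = len(spectra[0])
    total = [[0j] * n for _ in range(n)]
    for x in spectra:
        for a in range(n):
            for b in range(n):
                total[a][b] += x[a] * x[b].conjugate()
    return total


bins = [[1 + 0j, 1j], [2 + 0j, -2j]]      # two bins, two channels (made up)
F = summed_periodogram(bins)
print(F)
```

The result is a Hermitian matrix, as expected for an estimate of a matrix power spectral density.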

So the computational formula for the narrow band adaptive F-K estimate of the 
apparent slowness vector has the form: 

$$\hat{p} = \arg\max_{p} \frac{h^{*}(f_0,p,V)\,F_0^{-1}(f_0)\,F_x(f_0)\,F_0^{-1}(f_0)\,h(f_0,p,V)}{h^{*}(f_0,p,V)\,F_0^{-1}(f_0)\,h(f_0,p,V)} \qquad (2)$$ 
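
A minimal Python sketch of evaluating the ratio in eq.(2) and scanning it over trial slowness vectors is given below. It makes several assumptions that are not taken from the program: a 1-component array, a unit-norm plane-wave steering vector for h, small matrices stored as nested lists, and made-up coordinates and spectra. It is meant only to show the structure of the computation.

```python
import cmath


def steering_vector(coords, freq, p_ew, p_ns):
    """Unit-norm plane-wave phase factors (assumed form of h for a 1C array)."""
    m = len(coords)
    return [cmath.exp(-2j * cmath.pi * freq * (p_ew * x + p_ns * y)) / m ** 0.5
            for x, y in coords]


def quad_form(h, a):
    """Real part of h* A h for a square matrix A given as nested lists."""
    n = len(h)
    return sum(h[i].conjugate() * a[i][j] * h[j]
               for i in range(n) for j in range(n)).real


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def fk_statistic(h, f0_inv, fx):
    """Ratio from eq.(2): h* F0^-1 Fx F0^-1 h / (h* F0^-1 h)."""
    numerator = quad_form(h, matmul(matmul(f0_inv, fx), f0_inv))
    return numerator / quad_form(h, f0_inv)


# Made-up example: 3 sensors, white noise (F0^-1 = I), and a signal MPSD built
# from a plane wave with slowness (0.1, 0.0) sec/km at 2 Hz plus unit noise.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
s = steering_vector(coords, 2.0, 0.1, 0.0)
fx = [[identity[i][j] + 9 * s[i] * s[j].conjugate() for j in range(3)]
      for i in range(3)]

scan = [(-0.1, 0.0), (0.0, 0.0), (0.1, 0.0)]
best = max(scan, key=lambda p: fk_statistic(
    steering_vector(coords, 2.0, *p), identity, fx))
print(best)
```

In this sketch the inverse noise matrix is passed in ready-made, which mirrors the fact that it is evaluated once per band rather than once per DFT frequency.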

As for the program “grfiltfk”, in the case of a 1-component or a two-component 
(with horizontal sensors) array the matrices $F_x(f)$ and $F_0(f)$ have the dimensions 
[M×M] or [2M×2M], respectively. Note that the general formula eq.(2) has 
several special cases, which make it possible to design a program providing multimode 
F-K analysis: 1) adaptive low resolution, 2) conventional low resolution and 3) high 
resolution, with ARMA modeling used for the estimation of the signal data MPSD $F_x(f)$. 
In case 1) the calculations are performed using the general eq.(2). In case 2) the matrix 
$F_0^{-1}(f)$ in eq.(2) is replaced by the identity matrix I; this implies that the denominator 
of eq.(2) becomes equal to 1 and eq.(2) itself coincides with the conventional formula for 
narrow band low resolution F-K analysis. In case 3) the matrix $F_0^{-1}(f)$ in eq.(2) is 
replaced by the matrix $F_x^{-1}(f)$; this implies that the numerator of eq.(2) becomes equal 
to 1 and eq.(2) itself coincides with the conventional formula of Capon high 
resolution F-K analysis. All these options are implemented in the program “sp3c”. 
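
For reference, the two nonadaptive special cases can be written out explicitly. The forms below are a sketch that assumes the vector h is normalized so that h*h = 1 (which the statement that the denominator equals 1 suggests). For case 3) they follow the text in taking the numerator equal to 1 and keeping only the denominator.

```latex
% Case 2: conventional low resolution, F_0^{-1}(f_0) = I and h^{*}h = 1
\hat{p} = \arg\max_{p}\; h^{*}(f_0,p,V)\,F_x(f_0)\,h(f_0,p,V)

% Case 3: Capon high resolution, F_0^{-1}(f_0) replaced by F_x^{-1}(f_0),
% numerator taken equal to 1
\hat{p} = \arg\max_{p}\; \frac{1}{h^{*}(f_0,p,V)\,F_x^{-1}(f_0)\,h(f_0,p,V)}
```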

Note that in all of the described modes the matrix functions $F_x(f)$ and $F_0^{-1}(f)$ are 
calculated in the program from the matrix coefficients of ARMA models of the 
signal and noise time series. These coefficients are produced by the program “marmamo”, 
which has to be executed before running the program “sp3c”; moreover, for the 
adaptive mode that program has to process the signal and the noise data separately. 
The matrix coefficients of the data ARMA models are transferred to the program “sp3c” 
via disk files with standard names. So the joint execution of these two programs 
is not very burdensome for a user, especially if the F-K analysis is performed with 
the help of a special SNDA script. 

Input parameters of the program 

All parameters of the program are to be contained in a disk file with the name 
“sp3c.inp”. An example of the file is given below. 

*** FILE OF INPUT PARAMETERS FOR PROGRAM "sp3c" : standard *** 

TYPE OF ANALYSIS: LOW RESOLUTION = 0; HIGH RESOLUTION = 1 
1 

ARRAY TYPE: 3C ARRAY =1; 1C ARRAY = 2; 2C HORIZONTAL ARRAY = 3; SSI = 4 
1 

WAVE TYPE: P = 1; SH & LOVE = 2; SV = 3; RAYLEIGH = 4 
IMAGE: CONTOUR MAP = 0; 3D IMADE = 1; BOTH = 2 
2 

PLOTTING: FROM PROGRAM = 0; FROM SCRIPT = 1 
0 

ADAPTIVE MODE = 1; WITHOUT ADAPTATION = 0; 

1 

INICIAL AND FINAL POINTS FOR SCANNING AT E-W-SLOWNESS (SEC/KM) 

-0.2 0.2 

INICIAL AND FINAL POINTS FOR SCANNING AT N-S SLOWNESS (SEC/KM) 

-0.2 0.2 

INCREMENTS FOR SCANNING AT SLOWNESS (SEC/KM) 

0.02 0.02 

RAYLEIGH WAVE ELLIPTIC FACTOR (HORIZ/VERT) 

0.7 

MEDIUM PHASE VELOCITY (KM/SEC) 

6. 

NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR SURFACE 
WAVE (IF NAME = '     ' - WITHOUT DISPERSION) 

disp.crv 

LIST OF FREQUENSY VALUES (<=10 in the limits of data freq band) , HZ 
1.5 0 0 2.4 0 0 0 0 0 0 

PLOTING SUM OF MAPS FOR DIFFERENT FREQUENCES: YES - 1; NO - 0 (ISUM) 

0 

NOISE REGULATIZATION VALUE: QUANTATY IN [0,1], REAL NOISE: 0, WHITE 
NOISE: 1 
0 

Explanation of the input parameters 

1. TYPE OF ANALYSIS: LOW RESOLUTION = 0; HIGH RESOLUTION = 1 
The parameter specifies the method of F-K analysis: low or high resolution. Note that 
this parameter is valid only for the nonadaptive mode. If the “Low resolution” method is 
selected, the spectrum matrix $F_0^{-1}(f)$ in eq.(2) is replaced by the identity matrix. 
If the “High resolution” method is selected, the spectrum matrix $F_0^{-1}(f)$ in eq.(2) 
is replaced by $F_x^{-1}(f)$, so that only the denominator of eq.(2) with this matrix 
substituted is computed. 

2. ARRAY TYPE: 3C ARRAY = 1; 1C ARRAY = 2; 2C HORIZONTAL ARRAY = 
3; SSI = 4 

The parameter determines the structure of the seismic recording system. The possible 
variants are: an array of 3C seismometers, an array of 1C vertical seismometers, 
an array of 2C horizontal seismometers, and the recording system SSI 
consisting of a single 3C seismometer and a 2C horizontal strainmeter. 

3. WAVE TYPE: P = 1; SH & LOVE = 2; SV = 3; RAYLEIGH = 4 

The parameter selects the type of wave phase whose specific polarization will be 
used in the F-K analysis. 

4. IMAGE: CONTOUR MAP = 0; 3D IMAGE = 1; BOTH = 2 

The parameter defines the type of the program output control file used by the 
utility that plots the F-K map. A user can choose one (or both) of the 
two specified programs: the standard UNIX graphic routine “contour” or the SNDA 
graphic program “surfer”. 

5. PLOTTING: FROM PROGRAM = 0; FROM SCRIPT = 1 

The parameter allows the graphic utility for F-K map imaging to be started from 
inside the program “sp3c”, or this function to be transferred to the special SNDA 
command. 

6. ADAPTIVE MODE = 1; WITHOUT ADAPTATION = 0 

The parameter switches the program between the conventional low resolution or 
high resolution F-K analysis modes (without adaptation) and the adaptive F-K analysis 
mode. In the latter case the parameter “TYPE OF ANALYSIS” must be set to the 
“Low resolution” mode. 

7. INITIAL AND FINAL POINTS FOR SCANNING AT E-W SLOWNESS 
(SEC/KM) 

The parameter assigns the initial and final values of the W-E horizontal slowness of the 
wave to be used while scanning the medium to create the F-K map. 

8. INITIAL AND FINAL POINTS FOR SCANNING AT N-S SLOWNESS 
(SEC/KM) 

The parameter assigns the initial and final values of the S-N horizontal slowness of the 
wave to be used while scanning the medium to create the F-K map. 

9. INCREMENTS FOR SCANNING AT DIRECTIONS (SEC/KM) 

These two parameters define the increments for scanning over the W-E and S-N slowness 
of the wave arrival while creating the F-K map. 

10. RAYLEIGH WAVE ELLIPTIC COEFFICIENT (HORIZ/VERT) 

This is the elliptic coefficient of the Rayleigh wave: the ratio of the minor axis of the 
polarization ellipse to the major one. 

11. MEDIUM PHASE VELOCITY (KM/SEC) 

This is the phase velocity of the P or S wave in the medium just beneath the array. If, 
when filtering Rayleigh or Love waves, the value of the parameter “Name of file with 
dispersion curve” consists of 5 blanks, this velocity is used as the average surface 
wave velocity. 

12. NAME OF FILE WITH PHASE VELOCITY DISPERSION CURVE FOR 
SURFACE WAVE (IF NAME = '     ' - WITHOUT DISPERSION) 

This is the name of the file containing the surface wave dispersion curve (the phase 
velocity as a function of frequency). If the first 5 positions of the name are blanks, 
dispersion is taken to be absent and the surface wave is regarded as having a 
frequency-independent (mean) phase velocity. The file has to consist of two 
ASCII columns: the frequency in Hz and the velocity in km/sec. 

13. LIST OF FREQUENCY VALUES (<=10 in the limits of data freq band), HZ 
The parameter determines the set of central frequencies of the narrow band F-K analysis. 
The F-K maps are calculated for frequency bands around these central frequencies. 
The width of the bands depends on the parameters of the array data ARMA model assigned 
in the program “marmamo” and transferred to the program “sp3c” via the file with the 
standard name. 
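
In the example parameter file the list is given as ten numbers, most of them zero. The Python sketch below reads such a line and keeps the non-zero entries. Treating zeros as unused slots is an assumption made for illustration; the program's actual handling of zero entries is not described here.

```python
def central_frequencies(line, max_count=10):
    """Return the non-zero entries of the frequency list (Hz)."""
    values = [float(v) for v in line.split()]
    if len(values) > max_count:
        raise ValueError("at most 10 frequency values are allowed")
    # Zero entries are assumed to mark unused slots.
    return [v for v in values if v > 0]


freqs = central_frequencies("1.5 0 0 2.4 0 0 0 0 0 0")
print(freqs)
```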

14. CALCULATION OF AVERAGED MAP FOR DIFFERENT FREQUENCIES: 
YES = 1; NO = 0 

This is the switch for the option of computing the averaged F-K map composed from 
the set of maps previously calculated for the different frequency bands. 
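
A point-by-point mean over maps defined on the same slowness grid is one simple way to form such an averaged map. The Python sketch below shows it under that assumption; whether the program normalizes or weights the individual maps before averaging is not stated here, and the sample maps are made up.

```python
def average_maps(maps):
    """Point-by-point mean of F-K maps defined on the same slowness grid."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / len(maps) for c in range(cols)]
            for r in range(rows)]


map_a = [[1.0, 2.0], [3.0, 4.0]]    # made-up 2 x 2 maps for two bands
map_b = [[3.0, 2.0], [1.0, 0.0]]
mean_map = average_maps([map_a, map_b])
print(mean_map)
```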

15. NOISE REGULARIZATION VALUE: QUANTITY IN THE LIMITS [0,1]: 
FOR REAL NOISE = 0, FOR WHITE NOISE = 1 

The parameter allows the noise suppression in the adaptive F-K mode to be relaxed. If it 
has the value 0, the calculations are made in accordance with eq.(2). For 
other values the noise matrix $F_0^{-1}(f)$ in eq.(2) is replaced by the matrix 
$\Phi = (1-\rho)F_0^{-1}(f) + \rho I$, where $I$ is the identity matrix and $\rho$ is the parameter value. 
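
The regularization is a linear blend between the estimated inverse noise matrix and the identity matrix. A minimal Python sketch follows, with the parameter value written as rho; the function name and the sample matrix are illustrative assumptions.

```python
def regularize(f0_inv, rho):
    """Return (1 - rho) * F0^-1 + rho * I for rho in [0, 1]."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    n = len(f0_inv)
    return [[(1 - rho) * f0_inv[i][j] + (rho if i == j else 0.0)
             for j in range(n)] for i in range(n)]


f0_inv = [[2.0, 0.5], [0.5, 2.0]]      # made-up inverse noise matrix
phi = regularize(f0_inv, 0.5)
print(phi)
```

With rho = 0 the matrix is returned unchanged (full noise suppression), and with rho = 1 it becomes the identity matrix, which corresponds to the white noise assumption.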

























